THERMOPHILIC DNA POLYMERASES FROM THERMOTOGA NEAPOLITANA



FIELD OF THE INVENTION 

The present invention relates to thermostable DNA polymerases derived from
the hyperthermophilic eubacterium Thermotoga neapolitana and to means for isolating
and producing the enzymes. Thermostable DNA polymerases are useful in many
recombinant DNA techniques, especially thermal cycle sequencing and nucleic acid
amplification.

BACKGROUND 

Thermophilic bacteria are organisms that are capable of growth at elevated
temperatures. Unlike the mesophiles, which grow best at temperatures in the range of
25-40°C, or psychrophiles, which grow best at temperatures in the range of 15-20°C,
thermophiles grow best at temperatures greater than 50°C. Indeed, some thermophiles
grow best at 65-75°C, and some of the hyperthermophiles grow at temperatures up to
130°C (See e.g., J.G. Black, Microbiology: Principles and Applications, 2d edition,
Prentice Hall, New Jersey, [1993] p. 145-146).

The thermophilic bacteria encompass a wide variety of genera and species.
There are thermophilic representatives included within the phototrophic bacteria (e.g.,
the purple bacteria, green bacteria, and cyanobacteria), eubacteria (e.g., Bacillus,
Clostridium, Thiobacillus, Desulfotomaculum, Thermus, Thermotoga, lactic acid
bacteria, actinomycetes, spirochetes, and numerous other genera), and the
archaebacteria (e.g., Pyrococcus, Thermococcus, Thermoplasma, Sulfolobus, and the
methanogens). There are aerobic as well as anaerobic thermophilic organisms. Thus,
the environments from which thermophiles may be isolated vary greatly, although all
of these organisms are isolated from areas associated with high temperatures.
Natural geothermal habitats have a worldwide distribution and are primarily associated
with tectonically active zones where major movements of the earth's crust occur.



Thermophilic bacteria have been isolated from all of the various geothermal habitats,
including boiling springs with neutral pH ranges, sulfur-rich acidic springs, and
deep-sea vents. In all cases, the organisms present in these geothermal habitats
appear to be optimally adapted to the temperatures at which they live (T.D. Brock,
"Introduction: An overview of the thermophiles," in T.D. Brock (ed.), Thermophiles:
General, Molecular and Applied Microbiology, John Wiley & Sons, New York [1986],
pp. 1-16). Basic as well as applied research on thermophiles has provided some
insight into the physiology of these organisms, as well as promise for use of these
organisms in industry and biotechnology.

I. The Genus Thermotoga

Thermotoga is a recently described genus with three recognized species; it
includes the most extremely thermophilic eubacteria known. The genus was first
described in 1986 by Huber et al. (R. Huber et al., Arch. Microbiol., 144:324
[1986]; and Int. J. Syst. Bacteriol., 36:575 [1986]). At that time, only one species,
T. maritima, had been described. T. neapolitana was first described by Jannasch et al.
in 1986 (Jannasch et al., Arch. Microbiol., 150:103-104 [1986]; and Int. J. Syst.
Bacteriol., 39:93 [1989]). T. thermarum was described by Windberger et al. in 1989
(Windberger et al., Arch. Microbiol., 151:506-512; and Int. J. Syst. Bacteriol., 42:327
[1992]).

These organisms were originally isolated from geothermally heated marine
sediments and hot springs. For example, T. maritima has been isolated from
geothermally heated sea floors in Italy, the Azores, Indonesia, and Iceland as well as
from continental, solfataric springs in Africa. T. neapolitana has been isolated from a
submarine thermal vent near Naples and from continental, solfataric springs in Africa.

Members of the genus Thermotoga are considered to be hyperthermophilic, as
they are capable of growth at temperatures up to 90°C; growth occurs at
temperatures between 55°C and 90°C, with an optimum growth temperature of
70-80°C. The Thermotoga are strictly anaerobic, non-sporing, Gram-negative rods
that ferment carbohydrates and may be motile by polar, lateral or peritrichous
flagella, although some strains are non-motile. The cells are surrounded by a
sheath-like outer structure which usually balloons over the ends. In all species of
Thermotoga, 1-4 cells may be enclosed within one sheath.

T. maritima, T. neapolitana, and T. thermarum are distinct species as judged by
numerous differences. For example, T. maritima possesses a single subpolar
flagellum and is motile, while motile strains of T. neapolitana possess peritrichous
flagella (some strains are non-motile), and T. thermarum possesses lateral flagella. In
addition, T. neapolitana will grow in NaCl concentrations ranging from 0.25 to 6.0%,
while T. maritima will grow in NaCl concentrations ranging from 0.25 to 3.75%, and
T. thermarum will grow in NaCl concentrations ranging from 0.2-0.55% (J.G. Holt et
al. (eds.), Bergey's Manual® of Determinative Bacteriology, 9th ed., Williams &
Wilkins, Baltimore, [1994], p. 333). Also, there are differences in the susceptibility
of these species to rifampicin, and differences in the inhibitory effects of hydrogen and
sulfur on these species. Furthermore, the doubling time at the optimum growth
temperature (80°C) is about 45 min for T. neapolitana and about 75 min for
T. maritima. The G+C content of the DNA of T. maritima and T. neapolitana is 46%
and 41%, respectively. The DNA from T. maritima and T. neapolitana shows only
about 25-30% homology by DNA-DNA hybridization studies.

A few of the enzymes of the Thermotoga and other thermophilic genera have
been studied in varying degrees of detail. As discussed below, the use of thermophilic
enzymes in industry has been viewed as providing advantages over the use of
mesophilic enzymes.

II. Uses For Thermophilic Enzymes 

Advances in molecular biology and industrial processes have led to an
increased interest in thermophilic organisms such as Thermotoga. Of particular
interest has been the development of thermophilic enzymes for use in industries
such as the detergent, flavor-enhancing, and starch industries. Indeed, the
cost savings associated with the longer storage stability and higher activity at higher
temperatures of thermophilic enzymes, as compared to mesophilic enzymes, provide
good reason to select and develop thermophilic enzymes for industrial and
biotechnology applications. Thus, there has been much research conducted to
characterize enzymes from thermophilic organisms. However, some thermophilic
enzymes have less activity than their mesophilic counterparts under similar conditions
at the elevated temperatures used in industry (typically temperatures in the range of
50-100°C) (T.K. Ng and William R. Kenealy, "Industrial Applications of Thermostable
Enzymes," in T.D. Brock (ed.), Thermophiles: General, Molecular, and Applied
Microbiology, [1986], John Wiley & Sons, New York, pp. 197-215). Thus, the choice
of a thermostable enzyme over a mesophilic one may not be as beneficial as originally
assumed. Nonetheless, of the $400 million worth of enzymes sold worldwide in 1984,
90% were thermostable enzymes used by the detergent and starch industries (Ng and
Kenealy, at p. 206). However, much research remains to be done to characterize and
compare thermophilic enzymes of importance in areas such as molecular biology (e.g.,
polymerases, ligases, topoisomerases, restriction endonucleases, etc.).

III. Thermophilic DNA Polymerases 

Extensive research has been conducted on the isolation of DNA polymerases
from mesophilic organisms such as E. coli (See e.g., Bessman et al., J. Biol. Chem.,
223:171 [1957]; Buttin and Kornberg, J. Biol. Chem., 241:5419 [1966]; and Joyce and
Steitz, Trends Biochem. Sci., 12:288-292 [1987]). Other mesophilic polymerases have
also been studied, such as those of Bacillus licheniformis (Stenesh and McGowan,
Biochim. Biophys. Acta, 475:32-44 [1977]; Stenesh and Roe, Biochim. Biophys. Acta,
272:156-166 [1972]), Bacillus subtilis (Low et al., J. Biol. Chem., 251:1311 [1976];
and Ott et al., J. Bacteriol., 165:951 [1986]), Salmonella typhimurium (Harwood et al.,
J. Biol. Chem., 245:5614 [1970]; Hamilton and Grossman, Biochem., 13:1885 [1974]),
Streptococcus pneumoniae (Lopez et al., J. Biol. Chem., 264:4255 [1989]), and
Micrococcus luteus (Engler and Bessman, Cold Spring Harbor Symp., 43:929 [1979]),
to name but a few.

Somewhat less investigation has been made into the isolation and purification of
DNA polymerases from thermophilic organisms. However, native (i.e.,
non-recombinant) and/or recombinant thermostable DNA polymerases have been
purified from various organisms, as shown in Table 1 below.

TABLE 1

Polymerase Isolation From Thermophilic Organisms

Organism                       Citation

Thermus aquaticus              Kaledin et al., Biochem., 45:494-501 (1980);
                               Biokhimiya 45:644-651 (1980).
                               Chien et al., J. Bacteriol., 127:1550 (1976).
                               University of Cincinnati, Master's thesis by A.
                               Chien, "Purification and Characterization of DNA
                               Polymerase from Thermus aquaticus" (1976).
                               University of Cincinnati, Master's thesis by D. B.
                               Edgar, "DNA Polymerase From an Extreme
                               Thermophile: Thermus aquaticus" (1974).
                               U.S. Patent No. 4,889,818*
                               U.S. Patent No. 5,352,600*
                               U.S. Patent No. 5,079,352*
                               European Patent Pub. No. 258,017*
                               PCT Pub. No. WO 94/26766*
                               PCT Pub. No. WO 92/06188*
                               PCT Pub. No. WO 89/06691*

Thermotoga maritima            PCT Pub. No. WO 92/03556*

Thermotoga strain FjSS3-B.1    Simpson et al., Biochem. Cell Biol., 68:1292-1296
                               (1990).

Thermosipho africanus          PCT Pub. No. 92/06200*

Thermus thermophilus           Myers and Gelfand, Biochem., 30:7661 (1991).
                               PCT Pub. No. WO 91/09950*
                               PCT Pub. No. WO 91/09944*
                               Bechtereva et al., Nucleic Acids Res., 17:10507
                               (1989).
                               Glukhov et al., Mol. Cell. Probes, 4:435-443 (1990).
                               Carballeira et al., BioTech., 9:276-281 (1990).
                               Ruttimann et al., Eur. J. Biochem., 149:41-46 (1985).
                               Oshima et al., J. Biochem., 75:179-183 (1974).
                               Sakaguchi and Yajima, Fed. Proc., 33:1492 (1974)
                               (abstract).

Thermus flavus                 Kaledin et al., Biochem., 46:1247-1254 (1981);
                               Biokhimiya 46:1576-1584 (1981).
                               PCT Pub. No. WO 94/26766*

Thermus ruber                  Kaledin et al., Biochem., 47:1515-1521 (1982);
                               Biokhimiya 47:1785-1791 (1982).

Thermoplasma acidophilum       Hamal et al., Eur. J. Biochem., 190:517-521 (1990).
                               Forterre et al., Can. J. Microbiol., 35:228-233
                               (1989).

Sulfolobus acidocaldarius      Salhi et al., J. Mol. Biol., 209:635-641 (1989).
                               Salhi et al., Biochem. Biophys. Res. Comm.,
                               167:1341-1347 (1990).
                               Rella et al., Ital. J. Biochem., 39:83-99 (1990).
                               Forterre et al., Can. J. Microbiol., 35:228-233
                               (1989).
                               Rossi et al., System. Appl. Microbiol., 7:337-341
                               (1986).
                               Klimczak et al., Nucleic Acids Res., 13:5269-5282
                               (1985).
                               Elie et al., Biochim. Biophys. Acta, 951:261-267
                               (1988).

Bacillus caldotenax            J. Biochem., 113:401-410 (1993).

Bacillus stearothermophilus    Sellmann et al., J. Bacteriol., 174:4350-4355 (1992).
                               Stenesh and McGowan, Biochim. Biophys. Acta,
                               475:32-44 (1977).
                               Stenesh and Roe, Biochim. Biophys. Acta,
                               272:156-166 (1972).
                               Kaboev et al., J. Bacteriol., 145:21-26 (1981).

Methanobacterium               Klimczak et al., Biochem., 25:4850-4855 (1986).
thermoautotrophicum

Thermococcus litoralis         Kong et al., J. Biol. Chem., 268:1965 (1993);
                               U.S. Patent No. 5,210,036*; U.S. Patent No.
                               5,322,785*

Pyrococcus furiosus            Lundberg et al., Gene, 108:1 (1991); PCT Pub. No.
                               WO 92/09689

* Herein incorporated by reference.

Although the organisms listed in Table 1 are considered thermophiles, many are
in the archaebacteria, a group that is evolutionarily distinct from the eubacterial
genus Thermotoga.

In addition to native forms, modified forms of thermostable DNA polymerases
having reduced or absent 5' to 3' exonuclease activity have been expressed and
purified from T. aquaticus, T. maritima, Thermus species sps17, Thermus species Z05,
T. thermophilus and T. africanus [PCT Publication No. 92/06200].



IV. Uses for Thermophilic DNA Polymerases

One application for thermostable DNA polymerases is the polymerase chain
reaction (PCR). The PCR process is described in U.S. Patent Nos. 4,683,195 and
4,683,202, the disclosures of which are incorporated herein by reference. Primers,
template, nucleoside triphosphates, the appropriate buffer and reaction conditions, and
polymerase are used in the PCR process, which involves denaturation of target DNA,
hybridization of primers and synthesis of complementary strands. The extension
product of each primer becomes a template for the production of the desired nucleic
acid sequence. If the polymerase employed in the PCR is a thermostable enzyme, then
polymerase need not be added after each denaturation step because heat will not
destroy the polymerase activity. Use of such enzymes as Taq DNA polymerase allows
repetitive heating/cooling cycles without the requirement of fresh enzyme at each
cooling step. This represents a major advantage over the use of mesophilic enzymes
such as Klenow, for which fresh enzyme must be added to each individual reaction
tube at every cooling step. The use of Taq in PCR is disclosed in U.S. Patent No.
4,965,188, EP Publ. No. 258,017, and PCT Publ. No. 89/06691, herein incorporated by
reference.
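
Because the extension product of each primer serves as a template in later cycles,
the amount of target grows roughly geometrically with cycle number. The following
is a minimal sketch of that idealized relationship, assuming a constant per-cycle
efficiency; the function name and the efficiency parameter are illustrative
assumptions and are not taken from the cited patents.

```python
def pcr_copies(initial_copies: float, cycles: int, efficiency: float = 1.0) -> float:
    """Idealized copy number after a number of PCR cycles.

    efficiency is the assumed fraction of templates copied per cycle
    (1.0 means perfect doubling); real reactions plateau, which this
    sketch does not model.
    """
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must lie between 0 and 1")
    return initial_copies * (1.0 + efficiency) ** cycles


if __name__ == "__main__":
    # One starting molecule, ten cycles of perfect doubling.
    print(pcr_copies(1, 10))
```

With perfect doubling, each additional cycle doubles the yield, which is why an
enzyme that survives every denaturation step is so convenient in practice.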

In addition to PCR, Taq DNA polymerase is widely used in other molecular
biology techniques including recombinant DNA methods. For example, various forms
of Taq have been used in a combination method which utilizes PCR and reverse
transcription (see e.g., U.S. Patent No. 5,322,770, herein incorporated by reference).
DNA sequencing methods have also been described which utilize Taq (see e.g., U.S.
Patent No. 5,075,216, herein incorporated by reference).

However, Taq DNA polymerase has certain characteristics which are
undesirable in PCR and other applications, including the presence of 5' to 3'
exonuclease activity. When thermostable DNA polymerases which have 5' to 3'
exonuclease activity (Taq, Tma, Tsps17, TZ05, Tth and Taf) are used in the PCR
process and other methods, a variety of undesirable results have been observed,
including a limitation of the amount of PCR product produced, an impaired ability to
generate long PCR products or to amplify regions containing significant secondary
structure, the production of shadow bands or the attenuation in signal strength of
desired termination bands during DNA sequencing, the degradation of the 5' end of
oligonucleotide primers in the context of double-stranded primer-template complex,
nick-translation synthesis during oligonucleotide-directed mutagenesis and the
degradation of the RNA component of RNA:DNA hybrids. When utilized in a PCR
process with double-stranded primer-template complex, the 5' to 3' exonuclease
activity of a DNA polymerase may result in the degradation of the 5' end of the
oligonucleotide primers. This activity is undesirable not only in PCR, but also in
second-strand cDNA synthesis and sequencing processes.

In the choice of enzyme for sequencing, various factors must be considered.
For example, large quantities of the enzyme should be easy to prepare; the enzyme
must be stable upon storage for considerable time periods; the enzyme should accept
all deoxy and dideoxy nucleotides and analogues as substrates with equal affinities and
high fidelity; the polymerase action should be highly processive over nucleotide
extensions to 1 kb and beyond, even through regions of secondary structure within the
template; the activity should remain high, even in suboptimal conditions; and it should
be inexpensive (A.T. Bankier, "Dideoxy sequencing reactions using Klenow fragment
DNA polymerase I," in H. and A. Griffin (eds.), Methods in Molecular Biology: DNA
Sequencing Protocols, Humana Press, Totowa, NJ, [1993], pp. 83-90). Furthermore,
the enzyme should be able to function at elevated temperatures (i.e., greater than about
70°C) so that non-specific priming reactions are minimized. However, there are no
commercially available enzymes which fully meet all of these criteria. Thus, mutant
forms of enzymes have been produced in order to address some of these needs.

For example, mutant forms of thermostable DNA polymerases which exhibit
reduced or absent 5' to 3' exonuclease activity have been generated. The Stoffel
fragment of Taq DNA polymerase lacks 5' to 3' exonuclease activity due to genetic
manipulations which result in the production of a truncated protein lacking the
N-terminal 289 amino acids (See e.g., Lawyer et al., J. Biol. Chem., 264:6427-6437
[1989]; and Lawyer et al., PCR Meth. Appl., 2:275-287 [1993]). Analogous mutant
polymerases have been generated for polymerases derived from T. maritima, Tsps17,
TZ05, Tth and Taf. While the generation of thermostable polymerases lacking 5' to 3'
exonuclease activity provides improved enzymes for certain applications, some of these
mutant polymerases still have undesirable characteristics, including the presence of 3'
to 5' exonuclease activity.

The 3' to 5' exonuclease activity is commonly referred to as a proof-reading
activity. The 3' to 5' exonuclease removes bases which are mismatched at the 3' end
of a primer-template duplex. While the presence of 3' to 5' exonuclease activity may
be advantageous, as it leads to an increase in the fidelity of replication of nucleic acid
strands, it also has undesirable characteristics. The 3' to 5' exonuclease activity
found in thermostable DNA polymerases such as Tma (including mutant forms of Tma
that lack 5' to 3' exonuclease activity) also degrades single-stranded DNA such as the
primers used in the PCR, single-stranded templates and single-stranded PCR products.
The integrity of the 3' end of an oligonucleotide primer used in a primer extension
process (e.g., PCR, Sanger sequencing methods, etc.) is critical, as it is from this
terminus that extension of the nascent strand begins. Degradation of the 3' end leads
to a shortened oligonucleotide, which in turn results in a loss of specificity in the
priming reaction (i.e., the shorter the primer, the more likely it becomes that spurious
or non-specific priming will occur).
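
The relationship between primer length and priming specificity can be illustrated
with a rough calculation. The sketch below assumes a random template with equal base
frequencies, in which a given primer of length n is expected to match by chance at
about (G - n + 1)/4^n positions on one strand of a template G bases long. The function
name and the template size used in the example are illustrative assumptions, not
values from this specification.

```python
def expected_random_matches(primer_length: int, template_length: int) -> float:
    """Expected number of chance exact matches of one primer on one strand.

    Assumes a random template in which A, C, G and T are equally likely,
    a simplification that real genomes only approximate.
    """
    if primer_length <= 0:
        raise ValueError("primer_length must be positive")
    if template_length < primer_length:
        return 0.0
    positions = template_length - primer_length + 1
    return positions / 4 ** primer_length


if __name__ == "__main__":
    template = 4_600_000  # hypothetical template size in bases
    for n in (20, 15, 12, 10):
        print(n, expected_random_matches(n, template))
```

Each base removed from the primer raises the expected number of chance matches by
roughly a factor of four, which is the sense in which a shortened primer loses
specificity.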

The degradation of an oligonucleotide primer by a 3' exonuclease can be
prevented by the use of modified nucleotides at the 3' terminus. For example, the use
of dideoxynucleotides or deoxynucleotides having a phosphorothiolate linkage at the 3'
terminus of an oligonucleotide would prevent degradation by 3' exonucleases.
However, the need to use modified nucleotides to prevent degradation of
oligonucleotides by 3' exonuclease increases the time and cost required to prepare
oligonucleotide primers.

A few examples of thermostable polymerases which lack both 5' to 3'
exonuclease and 3' to 5' exonuclease activity are known. As discussed above, the
Stoffel fragment of Taq DNA polymerase lacks the 5' to 3' exonuclease activity due to
genetic manipulation, and no 3' to 5' activity is present, as Taq polymerase is naturally
lacking in 3' to 5' exonuclease activity. Likewise, the Tth polymerase naturally lacks
3' to 5' exonuclease activity, and genetic deletion of N-terminal amino acids removes
the 5' to 3' exonuclease activity.

Despite the development of recombinant enzymes such as Stoffel fragment,
there remains a need for other thermostable polymerases having improved
characteristics. For example, thermostable polymerases are used in Sanger
dideoxynucleotide sequencing protocols. The most commonly used enzyme is Taq
polymerase or a modified form of Taq polymerase. High concentrations of the
expensive dideoxynucleotides must be used in the sequencing reaction when these
enzymes are employed, as they have a fairly low affinity for dideoxynucleotides. The
art needs a thermostable polymerase which displays a higher affinity for
dideoxynucleotides, as this would result in considerable cost savings. In addition, the
art needs additional thermostable polymerases having novel properties to improve the
results obtained when using techniques such as DNA amplification, sequencing and
nick-translation.



SUMMARY OF THE INVENTION 

The present invention relates to purified thermostable DNA polymerases
derived from the eubacterium Thermotoga neapolitana (Tne). Nucleic acid sequences
encoding the full-length Tne DNA polymerase are provided. In addition, nucleic acid
sequences encoding several modified forms of the Tne DNA polymerase are provided
herein. The present invention provides methods for the isolation of purified
preparations of Tne DNA polymerases. The Tne DNA polymerases may be isolated
from Thermotoga neapolitana cells or from host cells containing nucleic acid sequences
encoding a Tne DNA polymerase.

In one embodiment, the present invention contemplates a purified thermostable
DNA polymerase derived from the eubacterium Thermotoga neapolitana which is
capable of DNA synthetic activity. In another embodiment, the purified Tne DNA
polymerase has 3' exonuclease activity. In yet another embodiment, the purified Tne
DNA polymerase has 5' exonuclease activity. In one preferred embodiment, the
purified Tne DNA polymerase comprises the amino acid sequence of SEQ ID NO:2.
In a particularly preferred embodiment, the specific activity of the synthetic activity
of the purified Tne DNA polymerase is approximately 100,000 units/mg.

In another embodiment, the purified thermostable Tne DNA polymerase is a
non-naturally occurring or recombinant DNA polymerase. The recombinant Tne DNA
polymerase may further contain 3' exonuclease activity and/or 5' exonuclease activity.

In a preferred embodiment, the non-naturally occurring Tne DNA polymerase
has reduced levels of 3' exonuclease activity. In another embodiment, the
non-naturally occurring Tne DNA polymerase lacks significant 5' exonuclease activity.
In a particularly preferred embodiment, the non-naturally occurring Tne DNA
polymerase comprises an amino acid sequence selected from the group consisting of
SEQ ID NOS:8, 16, 19, 23, 26, 29, 33 and 35.

The present invention provides nucleic acid sequences encoding thermostable
DNA polymerases. In a preferred embodiment, an oligonucleotide comprising the
nucleic acid sequence of SEQ ID NO:1 encodes the thermostable DNA polymerase.
These nucleic acid sequences encoding thermostable DNA polymerases may be
modified to encode a polymerase which lacks significant 5' exonuclease activity. In a
preferred embodiment, the modified nucleic acid sequences encoding a thermostable
DNA polymerase comprise the nucleotide sequence of SEQ ID NO:7. In one
embodiment, the polymerase encoded by the modified nucleic acid sequences displays
reduced levels of 3' exonuclease activity. In a particularly preferred embodiment, the
modified nucleic acid sequences encoding a polymerase having reduced levels of 3'
exonuclease activity are selected from the group consisting of SEQ ID NOS:7, 15, 18,
22, 25, 28, 32 and 34.

The present invention provides recombinant DNA vectors containing nucleic
acid sequences which encode a thermostable DNA polymerase having DNA synthetic
activity. In a preferred embodiment, the polymerase-encoding nucleic acid sequences
are set forth in SEQ ID NO:1. The recombinant DNA vector may contain a modified
nucleic acid sequence encoding a thermostable DNA polymerase which lacks
significant 5' exonuclease activity. In a preferred embodiment, the recombinant DNA
vector contains a modified nucleic acid sequence which comprises SEQ ID NO:7.

In a preferred embodiment, the recombinant DNA vector contains modified
nucleic acid sequences encoding a thermostable DNA polymerase which exhibits
reduced levels of 3' exonuclease activity. In a particularly preferred embodiment, the
modified sequences encoding a thermostable DNA polymerase which exhibits reduced
levels of 3' exonuclease activity are selected from the group consisting of SEQ ID
NOS:7, 15, 18, 22, 25, 28, 32 and 34.

The present invention further contemplates the transformation of host cells with
the recombinant DNA vectors containing nucleic acid sequences encoding Tne DNA
polymerases. The invention is not limited by the choice of host cell; host cells may
comprise procaryotic or eucaryotic cells. In a preferred embodiment, the host cell is
an E. coli host cell.

The invention further provides methods for determining the DNA sequence of a
segment or portion of a DNA molecule using the Tne DNA polymerases of the
invention. Dideoxynucleotide (ddNTP) chain termination sequencing protocols are
used in conjunction with the polymerases of the invention. Traditional (i.e., Sanger) as
well as other methods, including but not limited to chain termination sequencing or
thermal cycle sequencing protocols, benefit from the use of the Tne DNA polymerases
of the invention. The claimed Tne DNA polymerases have a high affinity for
dideoxynucleotides; accordingly, the following ratios of dNTPs and ddNTPs are
contemplated for use in either thermal cycling or Sanger chain termination protocols
when Tne DNA polymerases are employed: dATP:ddATP is 1:1.67 ± 50%;
dCTP:ddCTP is 1:0.83 ± 50%; dGTP:ddGTP is 1:0.67 ± 50%; and TTP:ddTTP is
1:2.5 ± 50%, where each dNTP is present at a final concentration of about 1 µM to
120 µM.
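
As an arithmetic illustration of the ratios stated above, the following minimal
sketch converts a chosen dNTP concentration into the corresponding ddNTP
concentration window. It assumes that "± 50%" applies to the ddNTP term of each
ratio; that reading, the function name and the dictionary layout are assumptions
made for illustration only.

```python
# ddNTP amount per 1 part dNTP, taken from the ratios stated in the text.
DDNTP_PER_DNTP = {"A": 1.67, "C": 0.83, "G": 0.67, "T": 2.5}
TOLERANCE = 0.5  # assumed meaning of "± 50%": applied to the ddNTP term


def ddntp_window(base: str, dntp_um: float) -> tuple[float, float, float]:
    """Return (low, nominal, high) ddNTP concentration in µM for one base."""
    if dntp_um <= 0:
        raise ValueError("dNTP concentration must be positive")
    nominal = DDNTP_PER_DNTP[base] * dntp_um
    return (nominal * (1 - TOLERANCE), nominal, nominal * (1 + TOLERANCE))


if __name__ == "__main__":
    # Example: every dNTP at 20 µM, inside the stated 1 µM to 120 µM range.
    for base in "ACGT":
        low, nominal, high = ddntp_window(base, 20.0)
        print(f"dd{base}TP: {low:.1f} to {high:.1f} µM (nominal {nominal:.1f})")
```

For instance, with TTP at 20 µM the nominal ddTTP concentration under this reading
is 50 µM, with a window of 25 µM to 75 µM.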

DESCRIPTION OF THE DRAWINGS 

Figure 1 provides a schematic representation of the 5' exonuclease, 3'
exonuclease and polymerase domains in several DNA polymerases.

Figure 2 shows an alignment of amino acid residues from three regions within 
the 3' exonuclease domain of selected DNA polymerases. 

Figure 3 shows the alignment of the amino acid residues (using the one letter 
code for the amino acids) from E. coli DNA polymerase I, Tma DNA polymerase and 
Tne DNA polymerase. 

Figure 4 provides a schematic depicting the full length and mutant Tne DNA 
polymerases of the present invention. 

Figure 5A shows an autoradiograph of a sequencing gel.

Figure 5B shows an autoradiograph of a sequencing gel.

Figure 6A shows an autoradiograph of a sequencing gel.

Figure 6B shows an autoradiograph of a sequencing gel.

Figure 6C shows an autoradiograph of a sequencing gel.

DEFINITIONS

To facilitate understanding of the invention, a number of terms are defined
below.

The term "gene" refers to a DNA sequence that comprises control and coding
sequences necessary for the production of a polypeptide or precursor. The polypeptide
can be encoded by a full length coding sequence or by any portion of the coding
sequence so long as the desired enzymatic activity is retained.

The term "wild-type" refers to a gene or gene product which has the
characteristics of that gene or gene product when isolated from a naturally occurring
source. A wild-type gene is that which is most frequently observed in a population
and is thus arbitrarily designated the "normal" or "wild-type" form of the gene. In
contrast, the term "modified" or "mutant" refers to a gene or gene product which
displays modifications in sequence and/or functional properties (i.e., altered
characteristics) when compared to the wild-type gene or gene product. It is noted that
naturally-occurring mutants can be isolated; these are identified by the fact that they
have altered characteristics when compared to the wild-type gene or gene product.
The wild-type form of the coding region for the Tne DNA polymerase is listed in SEQ
ID NO:1; the wild-type form of the Tne DNA polymerase protein is listed in SEQ ID
NO:2. The Tne DNA polymerase proteins encoded by "modified" or "mutant" genes
are referred to as non-naturally occurring Tne DNA polymerases.

The term "recombinant DNA vector" as used herein refers to DNA sequences
containing a desired coding sequence and appropriate DNA sequences necessary for
the expression of the operably linked coding sequence in a particular host organism.
DNA sequences necessary for expression in procaryotes include a promoter, optionally
an operator sequence, a ribosome binding site and possibly other sequences.
Eucaryotic cells are known to utilize promoters, polyadenylation signals and enhancers.

As used herein, the terms "cell," "cell line," and "cell culture" are used
interchangeably and all such designations include progeny. The words "transformants"
or "transformed cells" include the primary transformed cell and cultures derived from
that cell without regard to the number of transfers. All progeny may not be precisely
identical in DNA content, due to deliberate or inadvertent mutations. Mutant progeny
that have the same functionality as screened for in the originally transformed cell are
included in the definition of transformants.

As used herein, the term "vector" is used in reference to nucleic acid molecules
that transfer DNA segment(s) from one cell to another. The term "vehicle" is
sometimes used interchangeably with "vector."

The term "expression vector" as used herein refers to a recombinant DNA
molecule containing a desired coding sequence and appropriate nucleic acid sequences
necessary for the expression of the operably linked coding sequence in a particular
host organism. Nucleic acid sequences necessary for expression in procaryotes usually
include a promoter, an operator (optional), and a ribosome binding site, often along
with other sequences. Eucaryotic cells are known to utilize promoters, enhancers, and
termination and polyadenylation signals.

The terms "in operable combination", "in operable order" and "operably linked"
as used herein refer to the linkage of nucleic acid sequences in such a manner that a
nucleic acid molecule capable of directing the transcription of a given gene and/or the
synthesis of a desired protein molecule is produced. The term also refers to the
linkage of amino acid sequences in such a manner that a functional protein is
produced.

The term "transfection" as used herein refers to the introduction of foreign
DNA into eucaryotic cells. Transfection may be accomplished by a variety of means
known to the art including calcium phosphate-DNA co-precipitation,
DEAE-dextran-mediated transfection, polybrene-mediated transfection, electroporation,
microinjection, liposome fusion, lipofection, protoplast fusion, retroviral infection, and
biolistics.

As used herein, the terms "complementary" or "complementarity" are used in 
reference to polynucleotides (i.e., a sequence of nucleotides) related by the
base-pairing rules. For example, the sequence "A-G-T" is complementary to the
sequence "T-C-A." Complementarity may be "partial," in which only some of the
nucleic acids' bases are matched according to the base-pairing rules. Or, there may be
"complete" or "total" complementarity between the nucleic acids. The degree of
complementarity between nucleic acid strands has significant effects on the efficiency
and strength of hybridization between nucleic acid strands. This is of particular
importance in amplification reactions, as well as detection methods which depend upon 
binding between nucleic acids. 
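
By way of illustration only, the base-pairing rules described above may be sketched in Python as follows. The function names `complement` and `fraction_matched` are hypothetical and do not appear in the specification; the sketch compares two strands position by position, in the same left-to-right order used in the "A-G-T"/"T-C-A" example, and assumes both strands are DNA of equal length.

```python
# Illustrative sketch only; names are hypothetical, not part of the specification.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}  # Watson-Crick DNA base pairs


def complement(seq: str) -> str:
    """Return the base-by-base complement, written in the same order as the input."""
    return "".join(PAIRS[base] for base in seq.upper())


def fraction_matched(strand_a: str, strand_b: str) -> float:
    """Fraction of aligned positions that obey the base-pairing rules.

    1.0 corresponds to "complete" complementarity; values between 0 and 1
    correspond to "partial" complementarity.
    """
    if not strand_a or len(strand_a) != len(strand_b):
        raise ValueError("strands must be non-empty and of equal length")
    matches = sum(PAIRS[a] == b for a, b in zip(strand_a.upper(), strand_b.upper()))
    return matches / len(strand_a)


print(complement("AGT"))               # complement of the example sequence
print(fraction_matched("AGT", "TCA"))  # complete complementarity
print(fraction_matched("AGT", "TCG"))  # partial complementarity (one mismatch)
```

A positional fraction of this kind is only a crude stand-in for the degree of complementarity; it does not model hybridization strength, which also depends on the conditions discussed below.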

The term "homology" refers to a degree of complementarity. There may be
partial homology or complete homology (i.e., identity). A partially complementary
sequence, one that at least partially inhibits a completely complementary sequence
from hybridizing to a target nucleic acid, is referred to using the functional term
"substantially homologous." The inhibition of hybridization of the completely
complementary sequence to the target sequence may be examined using a hybridization
assay (Southern or Northern blot, solution hybridization and the like) under conditions
of low stringency. A substantially homologous sequence or probe will compete for
and inhibit the binding (i.e., the hybridization) of a completely homologous sequence to a target
under conditions of low stringency. This is not to say that conditions of low
stringency are such that non-specific binding is permitted; low stringency conditions
require that the binding of two sequences to one another be a specific (i.e., selective)
interaction. The absence of non-specific binding may be tested by the use of a second
target which lacks even a partial degree of complementarity (e.g., less than about 30%
identity); in the absence of non-specific binding the probe will not hybridize to the 
second non-complementary target. 

Low stringency conditions comprise conditions equivalent to binding or
hybridization at 42°C in a solution consisting of 5X SSPE (43.8 g/l NaCl, 6.9 g/l
NaH2PO4·H2O and 1.85 g/l EDTA, pH adjusted to 7.4 with NaOH), 0.1% SDS, 5X
Denhardt's reagent [50X Denhardt's contains per 500 ml: 5 g Ficoll (Type 400,
Pharmacia), 5 g BSA (Fraction V; Sigma)] and 100 µg/ml denatured salmon sperm
DNA followed by washing in a solution comprising 5X SSPE, 0.1% SDS at 42°C
when a probe of about 500 nucleotides in length is employed.

The art knows well that numerous equivalent conditions may be employed to 
comprise low stringency conditions; factors such as the length and nature (DNA, RNA, 
base composition) of the probe and nature of the target (DNA, RNA, base
composition, present in solution or immobilized, etc.) and the concentration of the salts
and other components (e.g., the presence or absence of formamide, dextran sulfate,
polyethylene glycol) are considered and the hybridization solution may be varied to
generate conditions of low stringency hybridization different from, but equivalent to,
the above listed conditions. In addition, the art knows conditions which promote
hybridization under conditions of high stringency (e.g., increasing the temperature of
the hybridization and/or wash steps, the use of formamide in the hybridization
solution, etc.).

When used in reference to a double-stranded nucleic acid sequence such as a 
cDNA or genomic clone, the term "substantially homologous" refers to any probe 
which can hybridize to either or both strands of the double-stranded nucleic acid 
sequence under conditions of low stringency as described above.

When used in reference to a single-stranded nucleic acid sequence, the term
"substantially homologous" refers to any probe which can hybridize (i.e., it is the
complement of) the single-stranded nucleic acid sequence under conditions of low 
stringency as described above. 

As used herein, the term "hybridization" is used in reference to the pairing of
complementary nucleic acids. Hybridization and the strength of hybridization (i.e., the
strength of the association between the nucleic acids) are impacted by such factors as
the degree of complementarity between the nucleic acids, stringency of the conditions
involved, the Tm of the formed hybrid, and the G:C ratio within the nucleic acids.

As used herein, the term "Tm" is used in reference to the "melting temperature."
The melting temperature is the temperature at which a population of double-stranded
nucleic acid molecules becomes half dissociated into single strands. The equation for
calculating the Tm of nucleic acids is well known in the art. As indicated by standard
references, a simple estimate of the Tm value may be calculated by the equation:
Tm = 81.5 + 0.41(% G + C), when a nucleic acid is in aqueous solution at 1 M NaCl [see
e.g., Anderson and Young, Quantitative Filter Hybridization, in Nucleic Acid
Hybridization (1985)]. Other references include more sophisticated computations which
take structural as well as sequence characteristics into account for the calculation of
Tm.
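
As a worked illustration of the simple estimate given above, the following Python sketch computes the percent G + C of a sequence and applies the equation Tm = 81.5 + 0.41(% G + C). The function names `gc_percent` and `estimate_tm` are hypothetical, the result is taken to be in degrees Celsius, and the estimate is assumed to apply only under the stated condition (aqueous solution at 1 M NaCl); it is not one of the more sophisticated computations referred to above.

```python
# Illustrative sketch only; function names are hypothetical.

def gc_percent(seq: str) -> float:
    """Percentage of bases in the sequence that are G or C."""
    if not seq:
        raise ValueError("sequence must not be empty")
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def estimate_tm(seq: str) -> float:
    """Simple Tm estimate, Tm = 81.5 + 0.41(% G + C), assumed to be for 1 M NaCl."""
    return 81.5 + 0.41 * gc_percent(seq)


# A sequence with 50% G + C gives 81.5 + 0.41 * 50 = 102.0 by this formula.
print(round(estimate_tm("ATGCATGC"), 1))
```

Because the estimate depends only on base composition, two sequences of very different length but equal G + C content receive the same value, which is one reason the text points to more detailed calculations.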

As used herein the term "stringency" is used in reference to the conditions of
temperature, ionic strength, and the presence of other compounds such as organic
solvents, under which nucleic acid hybridizations are conducted. With "high
stringency" conditions, nucleic acid base pairing will occur only between nucleic acid
fragments that have a high frequency of complementary base sequences. Thus,
conditions of "weak" or "low" stringency are often required with nucleic acids that are
derived from organisms that are genetically diverse, as the frequency of
complementary sequences is usually less. 

As used herein, the term "amplifiable nucleic acid" is used in reference to 
nucleic acids which may be amplified by any amplification method. It is contemplated 
that "amplifiable nucleic acid" will usually comprise "sample template."

As used herein, the term "sample template" refers to nucleic acid originating 
from a sample which is analyzed for the presence of "target" (defined below). In 
contrast, "background template" is used in reference to nucleic acid other than sample 
template which may or may not be present in a sample. Background template is most
often inadvertent. It may be the result of carryover, or it may be due to the presence
of nucleic acid contaminants sought to be purified away from the sample. For
example, nucleic acids from organisms other than those to be detected may be present 
as background in a test sample. 

As used herein, the term "primer" refers to an oligonucleotide, whether
occurring naturally as in a purified restriction digest or produced synthetically, which
is capable of acting as a point of initiation of synthesis when placed under conditions
in which synthesis of a primer extension product which is complementary to a nucleic
acid strand is induced (i.e., in the presence of nucleotides and an inducing agent such
as DNA polymerase and at a suitable temperature and pH). The primer is preferably
single stranded for maximum efficiency in amplification, but may alternatively be
double stranded. If double stranded, the primer is first treated to separate its strands
before being used to prepare extension products. Preferably, the primer is an
oligodeoxyribonucleotide. The primer must be sufficiently long to prime the synthesis
of extension products in the presence of the inducing agent. The exact lengths of the
primers will depend on many factors, including temperature, source of primer and the
use of the method.

As used herein, the term "probe" refers to an oligonucleotide (i.e., a sequence
of nucleotides), whether occurring naturally as in a purified restriction digest or 
produced synthetically, recombinantly or by PCR amplification, which is capable of 
hybridizing to another oligonucleotide of interest. A probe may be single-stranded or 
double-stranded. Probes are useful in the detection, identification and isolation of
particular gene sequences. It is contemplated that any probe used in the present
invention will be labeled with any "reporter molecule," so that it is detectable in any
detection system, including, but not limited to enzyme (e.g., ELISA, as well as
enzyme-based histochemical assays), fluorescent, radioactive, and luminescent systems.
It is not intended that the present invention be limited to any particular detection
system or label.

As used herein, the term "target," when used in reference to the polymerase 
chain reaction, refers to the region of nucleic acid bounded by the primers used for 
polymerase chain reaction. Thus, the "target" is sought to be sorted out from other 
nucleic acid sequences. A "segment" is defined as a region of nucleic acid within the
target sequence.

As used herein, the term "polymerase chain reaction" ("PCR") refers to the 
method of K.B. Mullis, U.S. Patent Nos. 4,683,195 and 4,683,202, hereby incorporated
by reference, which describe a method for increasing the concentration of a segment of
a target sequence in a mixture of genomic DNA without cloning or purification. This
process for amplifying the target sequence consists of introducing a large excess of
two oligonucleotide primers to the DNA mixture containing the desired target
sequence, followed by a precise sequence of thermal cycling in the presence of a DNA
polymerase. The two primers are complementary to their respective strands of the
double stranded target sequence. To effect amplification, the mixture is denatured and
the primers then annealed to their complementary sequences within the target
molecule. Following annealing, the primers are extended with a polymerase so as to
form a new pair of complementary strands. The steps of denaturation, primer 
annealing and polymerase extension can be repeated many times (i.e., denaturation, 
annealing and extension constitute one "cycle"; there can be numerous "cycles") to 
obtain a high concentration of an amplified segment of the desired target sequence. 
The length of the amplified segment of the desired target sequence is determined by 
the relative positions of the primers with respect to each other, and therefore, this 
length is a controllable parameter. By virtue of the repeating aspect of the process, the 
method is referred to as the "polymerase chain reaction" (hereinafter "PCR"). Because 
the desired amplified segments of the target sequence become the predominant 
sequences (in terms of concentration) in the mixture, they are said to be "PCR 
amplified". 
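
The effect of repeating the denaturation, annealing and extension cycle may be illustrated with an idealized calculation. The Python sketch below assumes that each cycle multiplies the number of target copies by (1 + efficiency); the function name `copies_after_cycles` and the `efficiency` parameter are hypothetical and are not taken from the specification, and real reactions generally fall short of perfect doubling, particularly in later cycles.

```python
# Idealized sketch only; the per-cycle efficiency model is an assumption.

def copies_after_cycles(initial_copies: float, cycles: int, efficiency: float = 1.0) -> float:
    """Estimated number of target copies after a given number of PCR cycles.

    efficiency = 1.0 models perfect doubling in every cycle;
    efficiency = 0.0 models no amplification at all.
    """
    if 0 > cycles:
        raise ValueError("cycles must be non-negative")
    if 0.0 > efficiency or efficiency > 1.0:
        raise ValueError("efficiency must lie between 0.0 and 1.0")
    return initial_copies * (1.0 + efficiency) ** cycles


# One starting copy, 30 cycles, perfect doubling: 2**30 copies.
print(copies_after_cycles(1, 30))
# The same reaction at an assumed 80% per-cycle efficiency yields far fewer copies.
print(copies_after_cycles(1, 30, efficiency=0.8))
```

Even under this simplified model, the geometric growth shows why the amplified segment becomes the predominant sequence in the mixture after a few tens of cycles.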

With PCR, it is possible to amplify a single copy of a specific target sequence 
in genomic DNA to a level detectable by several different methodologies (e.g., 
hybridization with a labeled probe; incorporation of biotinylated primers followed by 
avidin-enzyme conjugate detection; incorporation of 32P-labeled deoxynucleotide
triphosphates, such as dCTP or dATP, into the amplified segment). In addition to 
genomic DNA, any oligonucleotide sequence can be amplified with the appropriate set 
of primer molecules. In particular, the amplified segments created by the PCR process 
itself are, themselves, efficient templates for subsequent PCR amplifications. 

"Amplification" is a special case of nucleic acid replication involving template 
specificity. It is to be contrasted with non-specific template replication (i.e., 
replication that is template-dependent but not dependent on a specific template). 
Template specificity is here distinguished from fidelity of replication (i.e., synthesis of 
the proper polynucleotide sequence) and nucleotide (ribo- or deoxyribo-) specificity. 
Template specificity is frequently described in terms of "target" specificity. Target
sequences are "targets" in the sense that they are sought to be sorted out from other
nucleic acid. Amplification techniques have been designed primarily for this sorting
out.

Template specificity is achieved in most amplification techniques by the choice
of enzyme. Amplification enzymes are enzymes that, under the conditions in which they are used,
will process only specific sequences of nucleic acid in a heterogeneous mixture of
nucleic acid. For example, in the case of Qβ replicase, MDV-1 RNA is the specific
template for the replicase [D.L. Kacian et al., Proc. Natl. Acad. Sci. USA 69:3038
(1972)]. Other nucleic acid will not be replicated by this amplification enzyme.
Similarly, in the case of T7 RNA polymerase, this amplification enzyme has a
stringent specificity for its own promoters [M. Chamberlin et al., Nature 228:227
(1970)]. In the case of T4 DNA ligase, the enzyme will not ligate the two
oligonucleotides where there is a mismatch between the oligonucleotide substrate and
the template at the ligation junction [D.Y. Wu and R. B. Wallace, Genomics 4:560
(1989)]. Finally, Taq and Pfu polymerases, by virtue of their ability to function at
high temperature, are found to display high specificity for the sequences bounded and
thus defined by the primers; the high temperature results in thermodynamic conditions 
that favor primer hybridization with the target sequences and not hybridization with 
non-target sequences [PCR Technology, H.A. Erlich (ed.) (Stockton Press 1989)]. 

As used herein, the terms "PCR product", "PCR fragment" and "amplification
product" refer to the resultant mixture of compounds after two or more cycles of the
PCR steps of denaturation, annealing and extension are complete. These terms 
encompass the case where there has been amplification of one or more segments of 
one or more target sequences. 

As used herein, the term "amplification reagents" refers to those reagents
(deoxyribonucleotide triphosphates, buffer, etc.), needed for amplification except for
primers, nucleic acid template and the amplification enzyme. Typically, amplification
reagents along with other reaction components are placed and contained in a reaction 
vessel (test tube, microwell, etc.). 

As used herein, the terms "restriction endonucleases" and "restriction enzymes" 
refer to bacterial enzymes, each of which cuts double-stranded DNA at or near a
specific nucleotide sequence.

As used herein, the term "recombinant DNA molecule" refers to
a DNA molecule which is comprised of segments of DNA joined together by means of 
molecular biological techniques. 

DNA molecules are said to have "5' ends" and "3' ends" because
mononucleotides are reacted to make oligonucleotides in a manner such that the 5'
phosphate of one mononucleotide pentose ring is attached to the 3' oxygen of its
neighbor in one direction via a phosphodiester linkage. Therefore, an end of an
oligonucleotide is referred to as the "5' end" if its 5' phosphate is not linked to the 3'
oxygen of a mononucleotide pentose ring and as the "3' end" if its 3' oxygen is not
linked to a 5' phosphate of a subsequent mononucleotide pentose ring. As used
herein, a nucleic acid sequence, even if internal to a larger oligonucleotide, also may
be said to have 5' and 3' ends. In either a linear or circular DNA molecule, discrete
elements are referred to as being "upstream" or 5' of the "downstream" or 3' elements.
This terminology reflects the fact that transcription proceeds in a 5' to 3' fashion along
the DNA strand. The promoter and enhancer elements which direct transcription of a
linked gene are generally located 5' or upstream of the coding region. However,
enhancer elements can exert their effect even when located 3' of the promoter element
and the coding region. Transcription termination and polyadenylation signals are
located 3' or downstream of the coding region.

As used herein, the term "an oligonucleotide having a nucleotide sequence
encoding a gene" means a nucleic acid sequence comprising the coding region of a 
gene or in other words the nucleic acid sequence which encodes a gene product. The 
coding region may be present in either a cDNA, genomic DNA or RNA form. When 
present in a DNA form, the oligonucleotide may be single-stranded (i.e., the sense
strand) or double-stranded. Suitable control elements such as enhancers/promoters,
splice junctions, polyadenylation signals, etc., may be placed in close proximity to the
coding region of the gene if needed to permit proper initiation of transcription and/or
correct processing of the primary RNA transcript. Alternatively, the coding region
utilized in the expression vectors of the present invention may contain endogenous
enhancers/promoters, splice junctions, intervening sequences, polyadenylation signals,
etc., or a combination of both endogenous and exogenous control elements.

As used herein, the term "regulatory element" refers to a genetic element which 
controls some aspect of the expression of nucleic acid sequences. For example, a 
promoter is a regulatory element which facilitates the initiation of transcription of an
operably linked coding region. Other regulatory elements are splicing signals,
polyadenylation signals, termination signals, etc. (defined infra). 

Transcriptional control signals in eucaryotes comprise "promoter" and 
"enhancer" elements. Promoters and enhancers consist of short arrays of DNA 
sequences that interact specifically with cellular proteins involved in transcription
[Maniatis, T. et al., Science 236:1237 (1987)]. Promoter and enhancer elements have
been isolated from a variety of eucaryotic sources including genes in yeast, insect and
mammalian cells and viruses (analogous control elements, i.e., promoters, are also
found in procaryotes). The selection of a particular promoter and enhancer depends on
what cell type is to be used to express the protein of interest. Some eucaryotic
promoters and enhancers have a broad host range while others are functional in a
limited subset of cell types [for review see Voss, S.D. et al., Trends Biochem. Sci.,
11:287 (1986) and Maniatis, T. et al., supra (1987)]. For example, the SV40 early
gene enhancer is very active in a wide variety of cell types from many mammalian
species and has been widely used for the expression of proteins in mammalian cells
[Dijkema, R. et al., EMBO J. 4:761 (1985)]. Two other examples of
promoter/enhancer elements active in a broad range of mammalian cell types are those
from the human elongation factor 1α gene [Uetsuki, T. et al., J. Biol. Chem.,
264:5791 (1989), Kim, D.W. et al., Gene 91:217 (1990) and Mizushima, S. and
Nagata, S., Nuc. Acids. Res., 18:5322 (1990)] and the long terminal repeats of the
Rous sarcoma virus [Gorman, C.M. et al., Proc. Natl. Acad. Sci. USA 79:6777 (1982)]
and the human cytomegalovirus [Boshart, M. et al., Cell 41:521 (1985)].

As used herein, the term "promoter/enhancer" denotes a segment of DNA 
which contains sequences capable of providing both promoter and enhancer functions 
(i.e., the functions provided by a promoter element and an enhancer element, see 

above for a discussion of these functions). For example, the long terminal repeats of
retroviruses contain both promoter and enhancer functions. The enhancer/promoter
may be "endogenous" or "exogenous" or "heterologous." An "endogenous"
enhancer/promoter is one which is naturally linked with a given gene in the genome.
An "exogenous" or "heterologous" enhancer/promoter is one which is placed in
juxtaposition to a gene by means of genetic manipulation (i.e., molecular biological
techniques) such that transcription of that gene is directed by the linked 
enhancer/promoter. 

The presence of "splicing signals" on an expression vector often results in 
higher levels of expression of the recombinant transcript in eucaryotic host cells. 

Splicing signals mediate the removal of introns from the primary RNA transcript and
consist of a splice donor and acceptor site [Sambrook, J. et al., Molecular Cloning: A
Laboratory Manual, 2nd ed., Cold Spring Harbor Laboratory Press, New York (1989)
pp. 16.7-16.8]. A commonly used splice donor and acceptor site is the splice junction 
from the 16S RNA of SV40. 

Efficient expression of recombinant DNA sequences in eucaryotic cells requires 
expression of signals directing the efficient termination and polyadenylation of the
resulting transcript. Transcription termination signals are generally found downstream
of the polyadenylation signal and are a few hundred nucleotides in length. The term
"poly A site" or "poly A sequence" as used herein denotes a DNA sequence which
directs both the termination and polyadenylation of the nascent RNA transcript.
Efficient polyadenylation of the recombinant transcript is desirable as transcripts
lacking a poly A tail are unstable and are rapidly degraded. The poly A signal utilized
in an expression vector may be "heterologous" or "endogenous." An endogenous poly
A signal is one that is found naturally at the 3' end of the coding region of a given
gene in the genome. A heterologous poly A signal is one which is
isolated from one gene and placed 3' of another gene. A commonly used heterologous
poly A signal is the SV40 poly A signal. The SV40 poly A signal is contained on a
237 bp BamHI/BclI restriction fragment and directs both termination and
polyadenylation [J. Sambrook, supra, at 16.6-16.7]. 

Eucaryotic expression vectors may also contain "viral replicons" or "viral
origins of replication." Viral replicons are viral DNA sequences which allow for the
extrachromosomal replication of a vector in a host cell expressing the appropriate
replication factors. Vectors which contain either the SV40 or polyoma virus origin of
replication replicate to high copy number (up to 10^4 copies/cell) in cells that express
the appropriate viral T antigen. Vectors which contain the replicons from bovine
papillomavirus or Epstein-Barr virus replicate extrachromosomally at low copy number
(~100 copies/cell).

The Tne polymerases may be expressed in either procaryotic or eucaryotic host
cells. Nucleic acid encoding the Tne polymerase may be introduced into bacterial host 
cells by a number of means including transformation of bacterial cells made competent 
for transformation by treatment with calcium chloride or by electroporation. If the Tne 
polymerases are to be expressed in eucaryotic host cells, nucleic acid encoding the Tne
polymerase may be introduced into eucaryotic host cells by a number of means
including calcium phosphate co-precipitation, spheroplast fusion, electroporation and
the like. When the eucaryotic host cell is a yeast cell, transformation may be effected
by treatment of the host cells with lithium acetate or by electroporation. 

As used herein, the terms "nucleic acid molecule encoding," "DNA sequence
encoding," and "DNA encoding" refer to the order or sequence of
deoxyribonucleotides along a strand of deoxyribonucleic acid. The order of these 
deoxyribonucleotides determines the order of amino acids along the polypeptide 
(protein) chain. The DNA sequence thus codes for the amino acid sequence. 
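
The relationship between the order of deoxyribonucleotides and the order of amino acids may be illustrated with a short Python sketch. It assumes the standard genetic code, reads the sense strand in successive triplets starting from the first base, and uses one-letter amino acid symbols with "*" for a stop codon. The names `CODON_TABLE` and `translate` are hypothetical and are not part of the specification.

```python
# Illustrative sketch only; assumes the standard genetic code.
from itertools import product

BASES = "TCAG"
# One-letter amino acids for the 64 codons in TCAG order (TTT, TTC, TTA, TTG, TCT, ...).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a sense-strand DNA sequence codon by codon ('*' marks a stop)."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore an incomplete trailing codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# ATG (Met), GCT (Ala), TAA (stop)
print(translate("ATGGCTTAA"))
```

Changing the order of the triplets changes the order of the amino acids in the output, which is the sense in which the DNA sequence codes for the amino acid sequence.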

The term "isolated" when used in relation to a nucleic acid, as in "an isolated
oligonucleotide" refers to a nucleic acid sequence that is identified and separated from
at least one contaminant nucleic acid with which it is ordinarily associated in its
natural source. Isolated nucleic acid is present in a form or setting that is
different from that in which it is found in nature. In contrast, non-isolated nucleic
acids are nucleic acids such as DNA and RNA found in the state they exist in nature.
For example, a given DNA sequence (e.g., a gene) is found on the host cell
chromosome in proximity to neighboring genes; RNA sequences, such as a specific
mRNA sequence encoding a specific protein, are found in the cell as a mixture with
numerous other mRNAs which encode a multitude of proteins. However, isolated
nucleic acid encoding a Tne polymerase includes, by way of example, such nucleic
acid in cells ordinarily expressing a Tne polymerase where the nucleic acid is in a
chromosomal location different from that of natural cells, or is otherwise flanked by a
different nucleic acid sequence than that found in nature. The isolated nucleic acid or
oligonucleotide may be present in single-stranded or double-stranded form. When an
isolated nucleic acid or oligonucleotide is to be utilized to express a protein, the
oligonucleotide will contain at a minimum the sense or coding strand (i.e., the
oligonucleotide may be single-stranded), but may contain both the sense and anti-sense
strands (i.e., the oligonucleotide may be double-stranded). 

As used herein the term "coding region" when used in reference to a structural
gene refers to the nucleotide sequences which encode the amino acids found in the
nascent polypeptide as a result of translation of a mRNA molecule. The coding region
is bounded on the 5' side by the nucleotide triplet "ATG" which encodes the initiator
methionine and on the 3' side by one of the three triplets which specify stop codons 
(i.e., TAA, TAG, TGA). 
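
The boundaries described above may be illustrated with a minimal Python sketch that scans a sense-strand DNA sequence for the first "ATG" and then reads in-frame triplets until one of the three stop codons is reached. The function name `find_coding_region` is hypothetical; the sketch considers only the first "ATG" and a single reading frame, and it returns the stop codon as part of the region, which is a choice made for illustration rather than a statement of the specification.

```python
# Illustrative sketch only; considers the first ATG and one reading frame.
from typing import Optional

STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_coding_region(dna: str) -> Optional[str]:
    """Return the span from the first ATG through the first in-frame stop codon.

    Returns None if no ATG is present or no in-frame stop codon follows it.
    """
    dna = dna.upper()
    start = dna.find("ATG")
    if start == -1:
        return None
    # Step through complete triplets downstream of the initiator codon.
    for i in range(start + 3, len(dna) - 2, 3):
        if dna[i:i + 3] in STOP_CODONS:
            return dna[start:i + 3]
    return None


# Two flanking bases on each side of ATG-GCT-TGA are excluded from the result.
print(find_coding_region("CCATGGCTTGACC"))
```

A practical gene finder would also examine the complementary strand, alternative start positions and all reading frames; those refinements are omitted here.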

As used herein, the term "purified" or "to purify" refers to the removal of 
contaminants from a sample. For example, recombinant Tne DNA polymerases are
expressed in bacterial host cells and the polymerases are purified by the removal of
host cell proteins; the percent of recombinant Tne DNA polymerase is thereby 
increased in the sample. 

The term "recombinant DNA molecule" as used herein refers to a DNA 
molecule which is comprised of segments of DNA joined together by means of
molecular biological techniques.

The term "recombinant protein" or "recombinant polypeptide" as used herein 
refers to a protein molecule which is expressed from a recombinant DNA molecule. 
The term "native protein" is used herein to indicate a protein isolated from a 
naturally occurring (i.e., a nonrecombinant) source. Molecular biological techniques
may be used to produce a recombinant form of a protein which has identical properties
when compared to the native form of the protein. The term "rTne" is used to
designate a recombinant form of Tne polymerase. The terms "nTne" and "nTaq" are
used to designate the native forms of Tne polymerase and Taq polymerase,
respectively. 

As used herein the term "portion" when in reference to a protein (as in "a
portion of a given protein") refers to fragments of that protein. The fragments may
range in size from four amino acid residues to the entire amino acid sequence minus 
one amino acid. 

As used herein, the term "fusion protein" refers to a chimeric protein containing 
the protein of interest (i.e., Tne DNA polymerases and fragments thereof) joined to an
exogenous protein fragment (the fusion partner which consists of a non-Tne
polymerase protein). The fusion partner may enhance solubility of the Tne polymerase
protein as expressed in a host cell, may provide an affinity tag to allow purification of
the recombinant fusion protein from the host cell or culture supernatant, or both. If
desired, the fusion protein may be removed from the protein of interest (i.e., Tne DNA
polymerase or fragments thereof) by a variety of enzymatic or chemical means known
to the art. 

The term "5' exonuclease activity" refers to the presence of an activity in a 
protein which is capable of removing nucleotides from the 5' end of an 
oligonucleotide. 5' exonuclease activity may be measured using any of the assays
provided herein.

The term "3' exonuclease activity" refers to the presence of an activity in a
protein which is capable of removing nucleotides from the 3' end of an
oligonucleotide. 3' exonuclease activity may be measured using any of the assays
provided herein.

The terms "DNA polymerase activity," "synthetic activity" and "polymerase
activity" are used interchangeably and refer to the ability of an enzyme to synthesize
new DNA strands by the incorporation of deoxynucleoside triphosphates. The 
examples below provide assays for the measurement of DNA polymerase activity. 

The term "reduced levels of 3' exonuclease" is used in reference to the level of
3' exonuclease activity displayed by the wild-type Tne DNA polymerase (i.e., the
polymerase of SEQ ID NO:2) and indicates that the modified or "non-naturally
occurring" polymerase exhibits lower levels of 3' exonuclease than does the full-length 
or unmodified enzyme. 

The phrase "lacks significant 5' exonuclease activity" is used relative to the
level of 5' exonuclease activity displayed by the wild-type Tne DNA polymerase (i.e.,
the polymerase of SEQ ID NO:2) and indicates that the modified or "non-naturally 
occurring" polymerase exhibits such low levels of 5' exonuclease that the measurement 
is at background levels in the assay. 

DESCRIPTION OF THE INVENTION

The present invention provides purified thermostable DNA polymerase I
enzymes derived from Thermotoga neapolitana (Tne). These thermostable enzymes
comprise the wild-type form of the enzyme as well as mutant forms which possess
altered characteristics relative to the wild-type enzyme. In particular, the present
invention provides deletion mutants which lack 5' exonuclease activity. Further, the
present invention provides modified forms of Tne DNA polymerases which lack 5'
exonuclease activity and have reduced or absent 3' exonuclease activity.

The present invention also relates to an improved method of determining the
nucleotide sequence of a DNA molecule using chain terminating dideoxynucleotides in
conjunction with the modified Tne DNA polymerases. The novel properties of the
polymerases of the invention provide improved enzymes for a variety of applications
which utilize thermostable DNA polymerases. 

The description of the invention is divided into: I. General Structural Features 
of Type A DNA Polymerases, II. Generation of Tne DNA Polymerases, III. Use of 
Tne DNA Polymerases in the PCR and IV. Use of Tne DNA Polymerases in DNA
Sequencing Methods.

I. General Structural Features Of DNA Polymerases 

DNA polymerases (DNAPs), such as those isolated from E. coli or from 
thermophilic bacteria of the genera Thermus or Thermotoga, are enzymes that 
synthesize new DNA strands. Several of the known DNAPs contain associated
nuclease activities in addition to the synthetic or polymerization activity of the
enzyme. 

Some DNAPs are known to remove nucleotides from the 5' and 3' ends of 
DNA chains [Kornberg, DNA Replication, W.H. Freeman and Co., San Francisco, 

pp. 127-139 (1980)]. These nuclease activities are usually referred to as 5'
exonuclease and 3' exonuclease activities, respectively. For example, the 5'
exonuclease activity located in the N-terminal domain of several DNAPs participates in
the removal of RNA primers during lagging strand synthesis during DNA replication
and the removal of damaged nucleotides during repair. Some DNAPs, such as the E.
coli DNA polymerase, also have a 3' exonuclease activity responsible for
proof-reading during DNA synthesis (Kornberg, supra).

DNAPs isolated from Thermus aquaticus (Taq), Thermus flavus (Tfl) and
Thermus thermophilus (Tth) have a 5' exonuclease activity, but lack a functional 3'
exonucleolytic domain [Tindall and Kunkel, Biochem. 27:6008 (1988)]. However, the
lack of a 3' exonuclease domain is not a general feature of DNAPs derived from
thermophilic bacteria as DNA polymerases from the thermophiles Thermotoga
maritima (Tma), Bacillus caldotenax, Thermococcus litoralis (Tli) and Pyrococcus
furiosus (Pfu) do contain 3' exonuclease activity. 

The 5' nuclease activity associated with a number of eubacterial Type A DNA
polymerases has been found to reside in the one-third N-terminal region of the protein
as an independent functional domain. In these polymerase molecules, the C-terminal
two-thirds of the molecule constitute the polymerization domain which is responsible
for the synthesis of DNA. Some Type A DNA polymerases also have a 3'
exonuclease activity associated with the two-third C-terminal region of the molecule.

Figure 1 provides a schematic showing the location of the 5' exonuclease, 3'
exonuclease and polymerase domains of a number of eubacterial DNAPs. As noted
above, not all DNAPs contain both 5' and 3' exonuclease domains.

Figure 1 provides a schematic depicting the arrangement of the 5' exonuclease
("5' EXO"), 3' exonuclease ("3' EXO") and polymerase ("POL") domains in the DNA
polymerases from phage T4 ("φT4"), phage T7 ("φT7"), E. coli (DNA polymerase I;
"Eco Pol I"), T. aquaticus ("Taq"), T. maritima ("Tma") and T. neapolitana ("Tne").
The absence of a 3' exonuclease domain in Taq DNA polymerase is indicated by the
use of the line between the boxed 5' exonuclease and polymerase domains; the absence
of a 5' nuclease domain in phage T4 polymerase is indicated by the absence of the
term "5' EXO" in the first boxed region of the molecule.

The 5' exonuclease activity and the polymerization activity of DNAPs have
been separated by proteolytic cleavage or genetic manipulation of the polymerase
molecule. The Klenow or large proteolytic cleavage fragment of E. coli DNA
polymerase I contains the polymerase and 3' exonuclease activity but lacks the 5'
nuclease activity [Brutlag et al., Biochem. Biophys. Res. Commun. 37:982 (1969)].
The Stoffel fragment of DNAP Taq lacks the 5' nuclease activity due to a genetic
manipulation which deleted the N-terminal 289 amino acids of the polymerase
molecule [Erlich et al., Science 252:1643 (1991)].

The removal of the 5' exonuclease domain from a DNAP may affect the
activity of the remaining domains. For example, removal of the 5' exonuclease
domain from the E. coli polymerase I protein to generate the Klenow fragment affects
the fidelity of the remaining large polymerase domain. The fidelity of a DNA
polymerase involves several functions including the ability to discriminate against
errors when nucleotides are initially inserted, discriminate against extension from
misaligned or mispaired primer termini and exonucleolytic removal of errors.

In comparison to the full-length enzyme, the Klenow fragment exhibits altered
base substitution error specificity and is less accurate for minus one base frameshift
errors at reiterated template nucleotides [Bebenek et al., J. Biol. Chem. 265:13878
(1990)]. Thus, the removal of the 5' exonuclease domain of E. coli DNA polymerase
I adversely affects the fidelity of the remaining 3' exonuclease and synthetic domains.

Removal of a 5' exonuclease domain does not always adversely affect the
fidelity of the resultant polymerase fragment. KlenTaq, a truncated version of Taq
DNA polymerase which lacks the first 235 N-terminal amino acids (which include the 5'
exonuclease domain), has been reported to improve the fidelity of the polymerase
two-fold [Barnes, Gene 112:29 (1992)].

Comparison of amino acid sequence in the 3' exonuclease domain of a number of
DNAPs has identified three domains, termed Exo I-III, which are highly conserved
between a variety of mesophilic and thermophilic organisms [Bernad et al., Cell 59:219
(1989)]. Figure 2 provides a schematic drawing which aligns the amino acid residues
from a number of DNAPs over the 3' exonuclease domain. In Figure 2, the one letter
code is used for the amino acids; the numbers represent the amino acid residue in a
given polymerase. In Figure 2, residues which are highly conserved are indicated by
the use of white letters within a black box. Portions of the 3' exonuclease domain of
the following polymerases are shown: Bacillus subtilis (Bsu) polymerase III; E. coli
(Eco) polymerase IIIε; phage T4, phage T7, E. coli polymerase I, T. maritima (Tma)
polymerase and T. neapolitana (Tne) polymerase. The "V" indicates amino acid 
residues involved in single strand DNA binding; the "A" indicates amino acid residues 
involved in metal binding and catalysis. 

Site-directed mutagenesis experiments have identified a subset of these
conserved residues as being critical for 3' exonuclease activity in E. coli polymerase I.
The critical residues include D355, D424 and D501, which are known to bind divalent
metal ions and are essential for 3' exonuclease activity; mutation of these residues
reduces 3' exonuclease activity several thousand fold. L361, F473 and Y497 are also
important for 3' exonuclease activity and are believed to ensure correct positioning of
the substrate in the active site. Mutation of L361 and Y497 reduces 3' exonuclease
activity 12.5 to 25-fold; mutation of F473 reduces 3' exonuclease activity about
3000-fold.

PCT Publ. No. WO 92/03556 states that three characteristic domains are critical
for 3' exonuclease activity in thermostable DNA polymerases; however, no
site-directed mutagenesis is shown for any of the "critical" residues and no 3'
exonuclease activity is reported for any of the mutant forms of Tma DNA polymerase
(primarily deletion mutants) shown. The three domains identified in PCT Publ. No.
WO 92/03556 are Domain A, which comprises D-X-E-X3-L; Domain B, which
comprises N-X3-D-X3-L and Domain C, which comprises Y-X3-D, where XN represents
the number (N) of non-critical amino acids between the specified amino acids. As
shown in Figure 2, the location, sequence and spacing of these three domains found in
polymerases derived from thermophilic organisms is consistent with the three domains
identified in polymerases derived from mesophilic organisms.

While identification of residues which are highly conserved between a number
of species provides a starting point for the design of site-directed mutagenesis
experiments, it does not provide an absolute prediction of the effect of a given
mutation in a particular protein. For example, the present invention shows that
substitution of the aspartate at position 468 of the Tne DNA polymerase with an
asparagine virtually eliminates the 3' exonuclease activity [Tne M284(D468N)]. The
analogous mutation in the Klenow fragment of DNA polymerase I (D501N) reduces 3'
exonuclease activity only by 2-fold [Derbyshire et al., EMBO J. 10:17 (1991)]. These
results underscore the fact that much remains to be learned about structure-function
relationships and that one cannot predict, with certainty, the effect of a given mutation
based on analogy to other proteins.
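
The three motifs quoted from PCT Publ. No. WO 92/03556 can be written as regular expressions and used to locate candidate residues in a protein sequence. The following is only a minimal sketch: the function name and the toy sequence are hypothetical, "X" is assumed to mean any single residue, and a motif hit marks a candidate site for mutagenesis, not a prediction of activity (as the paragraph above cautions).

```python
import re

# Motif patterns transcribed from the text; "X" (any residue) maps to ".".
MOTIFS = {
    "Domain A": r"D.E.{3}L",      # D-X-E-X3-L
    "Domain B": r"N.{3}D.{3}L",   # N-X3-D-X3-L
    "Domain C": r"Y.{3}D",        # Y-X3-D
}


def find_motifs(protein):
    """Return {domain: [1-based start positions]} for every motif occurrence."""
    hits = {}
    for name, pattern in MOTIFS.items():
        # Wrapping the pattern in a lookahead also reports overlapping matches.
        lookahead = re.compile("(?=" + pattern + ")")
        hits[name] = [m.start() + 1 for m in lookahead.finditer(protein)]
    return hits


# Hypothetical toy sequence, NOT taken from any real polymerase.
toy = "MKDAETGGLAAANPSTDVLKLAAYGHCDAA"

if __name__ == "__main__":
    for domain, positions in find_motifs(toy).items():
        print(domain, positions)
```

Because the patterns are short, chance matches are expected in any long protein; positions would normally be checked against an alignment such as the one in Figure 2.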

II. Generation Of Tne DNA Polymerases 

The present invention provides wild-type and modified forms of Tne DNA
polymerases. The modified forms lack 5' exonuclease activity and some modified
forms also display reduced or absent 3' exonuclease activity.

By the term "reduced or absent 3' exonuclease activity" it is meant that the
modified enzyme has less than the level of 3' exonuclease activity found in the
wild-type or unmodified enzyme whose protein sequence is listed in SEQ ID NO:2.
The modified Tne polymerases of the present invention are advantageous in situations
where the polymerization (i.e., synthetic) activity of the enzyme is desired but the
presence of 5' exonuclease and/or 3' exonuclease activity is not.

The present invention is not intended to be limited by the nature of the
alteration (e.g., deletion, insertion, substitution) necessary to render the Tne polymerase
deficient in 5' exonuclease or 3' exonuclease activity. The present invention
contemplates a variety of methods, including but not limited to proteolysis and genetic
manipulation.

1. Reduction Of Exonuclease Activity By Proteolysis

Tne DNA polymerases having a reduced level of either or both 5' exonuclease
and 3' exonuclease activity are produced according to the present invention by
physically cleaving the unmodified enzyme with proteolytic enzymes to produce
fragments of the enzyme that are deficient in 5' and/or 3' exonuclease activity but
retain synthetic activity. The proteolysis can remove the N-terminal one third of the
protein (about residues 1 to 297 in SEQ ID NO:2) to remove 5' exonuclease activity.
Proteolytic cleavage which removes all or a portion of the 3' exonuclease domain
(about residues 298 to 482 in SEQ ID NO:2) will render the resulting enzyme
deficient in 3' exonuclease activity. Following proteolytic digestion, the resulting
fragments are separated by standard chromatographic techniques and assayed for the
ability to synthesize DNA and to act as a 5' or 3' exonuclease. The assays to
determine synthetic activity and 5' and 3' exonuclease activity are described in the
experimental sections below.
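
The approximate domain boundaries given above can be used to estimate how much of each exonuclease domain a given cleavage removes. The sketch below is illustrative only: the function names are hypothetical, the boundaries are the "about" values quoted for SEQ ID NO:2, and the fraction of a domain removed is not a measure of residual activity, which is determined by assay.

```python
# Approximate domain boundaries quoted in the text for SEQ ID NO:2
# (1-based, inclusive residue numbers).
FIVE_PRIME_EXO = (1, 297)
THREE_PRIME_EXO = (298, 482)


def overlap(domain, first_removed, last_removed):
    """Number of residues of `domain` that fall inside the removed segment."""
    start = max(domain[0], first_removed)
    end = min(domain[1], last_removed)
    return max(0, end - start + 1)


def summarize_removal(first_removed, last_removed):
    """Fraction of each exonuclease domain covered by a removed segment."""
    report = {}
    for name, domain in (("5' exo", FIVE_PRIME_EXO), ("3' exo", THREE_PRIME_EXO)):
        size = domain[1] - domain[0] + 1
        report[name] = overlap(domain, first_removed, last_removed) / size
    return report


if __name__ == "__main__":
    # Removing the N-terminal one third, as described in the text.
    print(summarize_removal(1, 297))
    # A hypothetical deeper cleavage that also removes part of the 3' exo domain.
    print(summarize_removal(1, 390))
```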

2. Reduction Of Exonuclease Activity By Genetic 
Manipulation 

The examples below describe preferred methods for creating a construct (i.e., a
vector) encoding a polymerase derived from Tne DNA polymerase I. The wild-type
Tne polymerase is cloned by isolating genomic DNA from T. neapolitana cells using
molecular biological methods. The genomic DNA is cleaved into fragments
about 3 kb or larger using restriction enzymes and the fragments are inserted into a
suitable cloning vector such as a plasmid or bacteriophage vector; the vectors
containing fragments of T. neapolitana genomic DNA are then transformed into a
suitable E. coli host. Clones containing DNA encoding the Tne polymerase may be
isolated using functional assays (i.e., presence of thermostable polymerase in lysates of
transformed cells) or by hybridization using a probe derived from a region of
conservation among DNA polymerases derived from thermostable organisms.

Alternatively, the T. neapolitana genomic DNA may be used as the target in a
polymerase chain reaction (PCR) where the primers are selected from regions of high
sequence conservation among the genes encoding thermostable DNA polymerases.
Such a PCR may not amplify the entire coding region of the Tne polymerase I gene;
in such a case, the full-length Tne gene could be isolated by using the amplified
fragment as a probe to screen a genomic library containing T. neapolitana DNA.

Once the full-length Tne polymerase gene is obtained, regions encoding the 5'
exonuclease and/or 3' exonuclease may be altered by a variety of means to reduce or
eliminate these activities. Suitable deletion and site-directed mutagenesis procedures
are described below in the examples.

Deletion of amino acids from the protein can be done either by deletion of the 
encoding genetic material, or by introduction of a translational stop codon by mutation 
or frame shift. In addition, proteolytic treatment of the protein molecule can be 
performed to remove segments of the protein. 

In the examples below, specific alterations of the Tne polymerase gene were: a
deletion between residues 1-849, a deletion between residues 1-945, a deletion between
residues 1-966, deletion between residues 1-966, a deletion between residues 1-849 and
residues 925-1272 and substitutions at residues 946, 947, 967, 968, 969, 975, 1166,
1167, 1391, 1402, 1407 and 1410. These modified sequences are described below in
the examples and at SEQ ID NOS:7, 10, 15, 18, 22, 25, 28, 32, 34, 36 and 38.

Those skilled in the art know that single base changes can be innocuous in
terms of enzyme structure and function. Similarly, small additions and deletions can be
present without substantially changing the exonuclease or polymerase function of the
wild-type or modified Tne DNA polymerases. To test whether a particular change is
innocuous in terms of the effect upon enzymatic activity, the polymerase encoded by a
given DNA sequence is tested for the presence of synthetic activity, 5' exonuclease
activity and 3' exonuclease activity as in the assays described in the examples below.
DNA sequences which contain alterations other than those listed in SEQ ID NOS:7, 10,
15, 18, 22, 25, 28, 32, 34, 36 and 38 but which encode a polymerase molecule having
the properties associated with the polymerases encoded by the above SEQ ID NOS are
contained within the present invention.
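
The gene alterations listed above are given as positions in the gene, while the mutants themselves are named by amino acid (for example Tne M284(D468N)). Assuming the listed positions are nucleotides numbered from the first base of the coding region (an interpretation, not something the text states outright), the two numbering schemes can be related with simple integer arithmetic. The helper names below are hypothetical.

```python
def codon_number(nucleotide_position):
    """1-based codon index containing a 1-based coding-region nucleotide."""
    if nucleotide_position < 1:
        raise ValueError("positions are 1-based")
    return (nucleotide_position - 1) // 3 + 1


def first_retained_codon(last_deleted_nucleotide):
    """First codon remaining after deleting nucleotides 1..last_deleted_nucleotide."""
    if last_deleted_nucleotide % 3 != 0:
        raise ValueError("deletion does not end on a codon boundary")
    return last_deleted_nucleotide // 3 + 1


if __name__ == "__main__":
    # Under the stated assumption, nucleotide 1402 lies in codon 468 and a
    # deletion of nucleotides 1-849 leaves codon 284 as the first codon,
    # which is consistent with the name Tne M284(D468N) used in the text.
    print(codon_number(1402), first_retained_codon(849))
```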

Other deletions and substitutions are also suitable to create modified Tne DNA
polymerases lacking 5' and/or 3' exonuclease activity. For example, given the
degeneracy of the genetic code, several DNA sequences may be used to introduce
substitutions which result in the expression of the same amino acid. It is preferable
that the alteration decrease the 5' and/or 3' exonuclease activity to a level which is
low enough to provide an improved enzyme for a variety of applications such as PCR
and chain termination sequencing (including thermal cycle sequencing) as discussed
below in the examples. These modifications will preferably not reduce the synthetic
activity of the modified enzyme. Modified polymerases are tested for the presence of
synthetic activity and 5' and 3' exonuclease activity as in assays described below.
Thoughtful consideration of these assays allows for the screening of candidate enzymes
whose structure is as yet unknown. In other words, construct "X" can be
evaluated according to the protocol described below to determine whether it is a
member of the genus of modified Tne polymerases of the present invention as defined
functionally, rather than structurally.

The present invention contemplates that the nucleic acid construct of the
present invention be capable of expression in a suitable host. In particular it is
preferable that the expression system chosen utilize a tightly controlled promoter such
that expression of the Tne polymerase is prevented until expression is induced. In this
manner, potential problems of toxicity of the expressed polymerases to the host cells
(and particularly to bacterial host cells) are avoided. Those in the art know methods for
attaching various promoters and 3' sequences to a gene structure to achieve efficient
and tightly controlled expression. The examples below disclose a number of suitable
vectors and vector constructs. Of course, there are other promoter/vector combinations
that would be suitable. The choice of a particular vector is also a function of the type
of host cell to be employed (i.e., procaryotic or eucaryotic).

It is not necessary that a host organism be used for the expression of the
nucleic acid constructs of the invention. For example, expression of the protein
encoded by a nucleic acid construct may be achieved through the use of a cell-free in
vitro transcription/translation system. An example of such a cell-free system is the
commercially available TnT™ Coupled Reticulocyte Lysate System (Promega; this
cell-free system is described in U.S. Patent No. 5,324,637, the disclosure of which is
herein incorporated by reference).

The nucleic acid construct containing DNA encoding the wild-type or a
modified Tne polymerase may provide for the addition of exogenous sequences (i.e.,
sequences not encoded by the Tne polymerase coding region) to either the 5' or 3' end
of the Tne polymerase coding region to allow for ease in purification of the resulting
polymerase protein (the resulting protein containing such an affinity tag is termed a
fusion protein). Several expression vectors are commercially available which
provide for the addition of affinity tags (an example of an exogenous sequence) to
either the amino or carboxy-termini of a coding region; in general these affinity tags
are short stretches of amino acids which do not alter the characteristics of the protein
to be expressed (i.e., no change to enzymatic activities).

For example, the pET expression system (Novagen) utilizes a vector containing
the T7 promoter, which drives expression of a fusion protein containing a short stretch of
histidine residues at either end of the protein, and a host cell which can be induced to
express the T7 RNA polymerase (i.e., a DE3 host strain). The production of fusion
proteins containing a histidine tract is not limited to the use of a particular expression
vector and host strain. Several commercially available expression vectors and host
strains can be used to express protein sequences as a fusion protein containing a
histidine tract. For example, the pQE series (pQE-8, 12, 16, 17, 18, 30, 31, 32, 40, 41,
42, 50, 51, 52, 60 and 70) of expression vectors (Qiagen), which are used with the host
strains M15[pREP4] (Qiagen) and SG13009[pREP4] (Qiagen), can be used to express
fusion proteins containing six histidine residues at the amino-terminus of the fusion
protein. Additional expression systems which utilize other affinity tags are known to
the art.

Once a suitable nucleic acid construct has been made, the Tne polymerase may
be produced from the construct. The examples below and standard molecular
biological teachings enable one to manipulate the construct by different suitable
methods.

Once the desired Tne polymerase has been expressed, the polymerase is tested 
for both synthetic and exonuclease activity as described below. 

III. Use Of Tne DNA Polymerases In The PCR 

The wild-type and modified Tne polymerases of the present invention provide
suitable and in some cases superior enzymes for use in the PCR. As shown in the
examples below, the wild-type and modified forms of Tne polymerase were found to
require the use of fewer units of polymerase activity to produce a given amount of
product DNA in PCRs as compared to wild-type Taq DNA polymerase (i.e., nTaq) or
a modified form of Tma DNA polymerase (i.e., UlTma™). In addition, modified
forms of Tne polymerase were found to tolerate a broader range of dNTP
concentrations and a broader range of magnesium ion concentrations in the PCR. The
ability to tolerate a broad range of dNTP concentrations is important as it allows
flexibility in the range of dNTPs to be used in a reaction; additionally, the ability to
tolerate a wide range of dNTP concentrations demonstrates that the enzymes of the
invention provide for a robust PCR (i.e., the enzyme is not sensitive to small
variations in dNTP concentration). The ability to produce only specific amplification
products over a wide range of magnesium ion concentration is advantageous for use in
multiplex PCRs.

Several of the modified Tne polymerases provide enzymes having greater
resistance to thermal inactivation as compared to nTaq or UlTma™ DNA polymerases.
Greater thermal stability is important for PCR applications as the greater the thermal
stability of the enzyme, the fewer units of enzyme must be used in the PCR.

IV. Use Of Tne DNA Polymerases In DNA Sequencing Methods 

The sequence of a deoxyribonucleic acid molecule can be elucidated using
chemical [Maxam and Gilbert, Proc. Natl. Acad. Sci. USA 74:560 (1977)] or
enzymatic [Sanger et al., Proc. Natl. Acad. Sci. USA 74:5463 (1977)] methods. The
enzymatic method of sequencing is based on the ability of a DNA polymerase to
extend a primer, hybridized to the template that is to be sequenced, until a
chain-terminating nucleotide is incorporated (referred to as chain terminating
sequencing). Each sequence determination is carried out as a set of four separate
reactions, each of which contains all four deoxyribonucleoside triphosphates (dNTP)
supplemented with a limiting amount of a different dideoxyribonucleoside triphosphate
(ddNTP). Because ddNTPs lack the 3'-OH group necessary for chain elongation, the
growing oligonucleotide is terminated selectively at G, A, T, or C, depending on the
respective dideoxy analog in the reaction.

The relative concentrations of each of the dNTPs and ddNTPs can be adjusted
to give a nested set of terminated chains over several hundred to a few thousand bases
in length. The resulting fragments, each with a common origin but ending in a
different nucleotide, are separated according to size by high-resolution denaturing gel
electrophoresis.
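
The way four termination reactions yield a readable ladder can be sketched in a few lines. This is an idealised illustration, not a model of real reaction kinetics: it assumes every position terminates with some probability in the matching reaction, the function names are hypothetical, and the template is written 3'->5' so that its first base pairs with the first nucleotide added to the primer.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def termination_lengths(template_3to5, primer_length=0):
    """Map each ddNTP reaction (G, A, T, C) to the chain lengths it can produce."""
    lengths = {base: [] for base in "GATC"}
    for i, template_base in enumerate(template_3to5):
        added = COMPLEMENT[template_base]      # nucleotide incorporated here
        lengths[added].append(primer_length + i + 1)
    return lengths


def read_ladder(lengths):
    """Rebuild the synthesized strand by reading the four lanes shortest first."""
    by_length = {n: base for base, ns in lengths.items() for n in ns}
    return "".join(by_length[n] for n in sorted(by_length))


if __name__ == "__main__":
    lanes = termination_lengths("TACGGA")   # hypothetical six-base template
    print(lanes)
    print(read_ladder(lanes))
```

Each chain length appears in exactly one lane, which is why sorting all fragments by size and noting the lane of each band recovers the sequence of the newly synthesized strand.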

Incorporation of a radiolabel into the oligonucleotide chain permits the
visualization of the sequencing products by autoradiography. The end-labeled primer
protocol, a modification of that described by Heiner et al. [(1988) Applied Biosystems,
Inc. DNA Sequencer Model 370 User Bulletin-Taq Polymerase: Increased Enzyme
Versatility in DNA Sequencing], uses [γ-32P]ATP, [γ-33P]ATP or [γ-35S]ATP to label
the sequencing primer. The DNA template and labeled primer are repeatedly annealed
and enzymatically extended/terminated in thermal cycled sequencing. The end-labeled
primer protocol is the most versatile sequencing method and is useful when working
with lambda DNA [Kaledin et al., Biokhimiya 45:494 (1980)], PCR templates, and
any template where false priming may be a problem. This protocol generates sequence
data very close to the primer and is recommended when this is needed. The reaction
also contains deaza nucleotide mixes that substitute 7-deaza dGTP for dGTP. The
deaza mixes resolve band compressions associated with GC-rich regions [Mizusawa et
al., Nucl. Acids Res. 14:1319 (1986) and Barr et al., Biotechniques 4:428 (1986)].

Thermal cycled sequencing is an alternative method for enzymatic sequence
analysis which takes advantage of the intrinsic properties of thermophilic DNA
polymerases, such as the one isolated from Thermus aquaticus (Taq DNA polymerase).
Because the protocol utilizes a thermocycling apparatus, several advantages are
realized over conventional sequencing strategies. First, the protocol yields a linear
amplification of the template DNA, reducing the amount of template required to
achieve a detectable sequence ladder. Using a 32P end-labeled primer, greater than 500
bases of sequence can be obtained from as little as 4 fmol (4 x 10^-15 moles) of
template after an overnight exposure. Second, the high temperatures employed
during each denaturation cycle eliminate the requirement for alkaline denaturation and
ethanol precipitation of double-stranded DNA (dsDNA) templates. The denaturation
cycles also help to circumvent the problems associated with rapid reannealing of linear
dsDNA templates such as PCR reaction products. Third, high annealing temperatures
increase the stringency of primer hybridization. Fourth, the high polymerization
temperature decreases the secondary structure of DNA templates and thus permits
polymerization through highly structured regions [Innis et al., Proc. Natl. Acad. Sci.
USA 85:9436 (1988)]. This system is useful for sequencing a wide variety of
templates such as amplified DNA, large double-stranded DNA templates such as
lambda, GC-rich templates and palindrome-rich templates.
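
The "linear amplification" mentioned above can be contrasted with the exponential amplification of a two-primer PCR by a short calculation. This is a rough upper-bound sketch under idealised assumptions: the function names, the per-cycle efficiency parameter and the cycle number used in the example are hypothetical and are not taken from the text; only the 4 fmol template amount is.

```python
def linear_products_fmol(template_fmol, cycles, efficiency=1.0):
    """One primer: each cycle copies only the original template once."""
    return template_fmol * cycles * efficiency


def exponential_products_fmol(template_fmol, cycles, efficiency=1.0):
    """Idealised two-primer PCR for comparison: products also act as templates."""
    return template_fmol * (1.0 + efficiency) ** cycles


if __name__ == "__main__":
    # 4 fmol of template (from the text) over a hypothetical 30 cycles.
    print(linear_products_fmol(4, 30))
    print(exponential_products_fmol(4, 30))
```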

Existing thermostable polymerases which are used in chain termination methods
of sequencing (both traditional and thermal cycling protocols) require fairly high
concentrations of ddNTPs as the affinity of these polymerases for ddNTPs is somewhat
low. For example, when Taq DNA polymerase is employed for chain termination
sequencing protocols, the optimal concentrations of ddNTPs in the ddNTP termination
mixtures (3X mixtures) are: 180 µM ddGTP, 1 mM ddATP, 1.5 mM ddTTP and 500 µM
ddCTP [as described in U.S. Patent 5,075,216, the disclosure of which is herein
incorporated by reference]. When the polymerase employed is a modified form of Taq
DNA polymerase, sTaq (sequencing grade Taq), the optimal concentrations of ddNTPs
in the ddNTP termination mixtures are: 30 µM ddGTP, 350 µM ddATP, 600 µM ddTTP
and 200 µM ddCTP. In contrast, a modified form of Tne DNA polymerase provided
herein utilizes the following concentrations of ddNTPs in the termination mixtures (3X
mixtures): 20 µM ddGTP, 50 µM ddATP, 75 µM ddTTP and 25 µM ddCTP. Because
ddNTPs are expensive, the use of a thermostable polymerase having a higher affinity
for ddNTPs (i.e., the modified Tne polymerase of the invention) will result in
considerable cost savings in DNA sequencing applications.
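
The size of the reduction in ddNTP usage can be worked out directly from the concentrations quoted above. The sketch below is only a convenience calculation: the dictionary and function names are hypothetical, all values are expressed in micromolar, the ddTTP value for the modified Tne enzyme is assumed to be in micromolar like its neighbours, and the sTaq mixtures are assumed to be stated on the same (3X) basis as the others.

```python
# Termination-mix concentrations in micromolar, as quoted in the text.
DDNTP_UM = {
    "Taq":          {"ddGTP": 180, "ddATP": 1000, "ddTTP": 1500, "ddCTP": 500},
    "sTaq":         {"ddGTP": 30,  "ddATP": 350,  "ddTTP": 600,  "ddCTP": 200},
    "modified Tne": {"ddGTP": 20,  "ddATP": 50,   "ddTTP": 75,   "ddCTP": 25},
}


def fold_reduction(reference, enzyme="modified Tne"):
    """Ratio of reference to enzyme concentration for each ddNTP."""
    ref, enz = DDNTP_UM[reference], DDNTP_UM[enzyme]
    return {nt: ref[nt] / enz[nt] for nt in ref}


def total_ddntp_um(enzyme):
    """Sum of the four ddNTP concentrations for one enzyme."""
    return sum(DDNTP_UM[enzyme].values())


if __name__ == "__main__":
    for reference in ("Taq", "sTaq"):
        print(reference, fold_reduction(reference))
    print({name: total_ddntp_um(name) for name in DDNTP_UM})
```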

EXPERIMENTAL 

The following examples serve to illustrate certain preferred embodiments and 
aspects of the present invention and are not to be construed as limiting the scope 
thereof. 

In the disclosure which follows, the following abbreviations apply: °C (degrees
Centigrade); g (gravitational field); vol (volume); w/v (weight to volume); v/v (volume
to volume); BSA (bovine serum albumin); CTAB (cetyltrimethylammonium bromide);
fmol (femtomole); HPLC (high pressure liquid chromatography); DTT (dithiothreitol);
DMF (N,N-dimethylformamide); DNA (deoxyribonucleic acid); i.d. (internal
diameter); p (plasmid); µl (microliters); ml (milliliters); µg (micrograms); pmoles
(picomoles); mg (milligrams); MOPS (3-[N-Morpholino]propanesulfonic acid); M
(molar); mM (milliMolar); µM (microMolar); nm (nanometers); kdal (kilodaltons); OD
(optical density); EDTA (ethylene diamine tetra-acetic acid); FITC (fluorescein
isothiocyanate); SDS (sodium dodecyl sulfate); NaPO4 (sodium phosphate); Tris
(tris(hydroxymethyl)-aminomethane); PMSF (phenylmethylsulfonylfluoride); TBE
(Tris-Borate-EDTA, i.e., Tris buffer titrated with boric acid rather than HCl and
containing EDTA); PBS (phosphate buffered saline); PPBS (phosphate buffered saline
containing 1 mM PMSF); PAGE (polyacrylamide gel electrophoresis); Tween
(polyoxyethylene-sorbitan); Boehringer Mannheim (Boehringer Mannheim,
Indianapolis, IN); Epicentre (Epicentre Technologies, Madison, WI); New England
Biolabs (New England Biolabs, Beverly, MA); Novagen (Novagen, Inc., Madison,
WI); Pharmacia (Pharmacia Biotech Inc., Piscataway, NJ); Perkin Elmer (Perkin
Elmer, Norwalk, CT); Promega (Promega Corp., Madison, WI); Qiagen (Qiagen Inc.,
Chatsworth, CA); Spectra (Spectra, Houston, TX); Stratagene (Stratagene Cloning
Systems, La Jolla, CA); USB (U.S. Biochemical, Cleveland, OH).

EXAMPLE 1 

Isolation Of The Tne DNA Polymerase Gene 

a) Growth of T. neapolitana Cells 

T. neapolitana cells (obtained from V.A. Svetlichny, The Institute of
Microbiology, Russian Academy of Sciences, Moscow) were grown in a medium
containing (per 100 ml): 0.1 ml K-phosphate solution [300 g/l K2HPO4 and 200 g/l
KH2PO4]; 1 ml of Solution 1 [27 g/l NH4Cl, 27 g/l CaCl2 and 31 g/l MgCl2·6H2O]; 1
ml of a 10% solution of yeast extract (Difco); 2.5% natural sea salt; 0.1 ml of a 2%
solution of resazurin; 1 ml of 5% NajSOj*, 150 mg NaHCO3 and 0.5% glucose.

b) Isolation Of Genomic DNA 

Large scale cultures (10 liters) of T. neapolitana cells were grown in the above
medium in a 10 l fermentation vessel under nitrogen (i.e., anaerobic conditions) at
75°C for 28 hours (early stationary phase). The cells were then collected by
centrifugation at 10,000 x g and the cell pellet was washed once with a solution
comprising 0.9% NaCl. The washed cell pellet was frozen at -70°C. DNA was
isolated from the frozen cells as follows. The frozen cells (3 g) were thawed in 30 ml
of a solution containing 100 mM Tris-HCl (pH 9.0), 50 mM EDTA and 2 mg/ml
lysozyme. The mixture was incubated for 30 min at 0°C and then SDS and proteinase
K were added to final concentrations of 1% and 100 µg/ml, respectively. The mixture
was incubated for 1.5 hours at 45°C with light shaking. Following the incubation, the
mixture was cooled to room temperature (about 25°C) and NaCl was added to a
concentration of 0.5 M. An equal volume of phenol/chloroform was added and the
mixture was extracted and the aqueous and organic phases were separated by
centrifugation for 10 min at 6,000 x g at room temperature. The supernatant was
transferred to a fresh tube using a wide-bore pipet. A total of six phenol/chloroform
extractions were performed (until the interphase disappeared). The DNA was
precipitated by the addition of ethanol and gently mixing the solution. The
precipitated DNA was washed with 70% ethanol. The DNA was then centrifuged for
5 min at 10,000 x g and the supernatant was discarded. The pellet was resuspended in
a buffer containing 10 mM Tris-HCl (pH 7.4), 0.1 mM EDTA and stored at -20°C
until used.

c) Construction Of A T. neapolitana Genomic DNA Library

The T. neapolitana DNA was then digested with Sau3A under conditions which
promoted the generation of fragments 3-8 kb in length. Briefly, 10 µg of genomic
DNA was digested with L6 units of Sau3A in a volume of 15 µl for 1 hour at 37°C.
The reaction was stopped by the addition of 5 µl of sample buffer [70% glycerol, 50
mM EDTA] and the digested DNA was run on a low melting temperature agarose gel
(BioRad). Fragments 3-8 kb in length were isolated from the gel using standard
procedures [Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd ed., Cold
Spring Harbor Laboratory Press, NY (1989) pp. 6.30-6.31]. The DNA recovered from
the gel was precipitated with ethanol, dried and resuspended in 100 µl of 10 mM
Tris-HCl (pH 7.5), 0.1 mM EDTA. Five microliters of the DNA mixture (about 0.1
µg) was ligated to 0.25 µg of the pTZ19R vector (Pharmacia) which had been digested
with BamHI and treated with bacterial alkaline phosphatase. The ligation products
were used to transform competent TG1 cells [TG1 cells are an EcoK- derivative of
JM101 cells; commercially available equivalents include NM522 cells (Pharmacia)
and XL1-Blue cells (Stratagene)] and the cells were plated onto MacConkey agar
plates (Difco).
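
The choice of enzymes in this step relies on the fact that Sau3A (recognition sequence GATC, cut immediately 5' of the G) and BamHI (GGATCC, cut after the first G) leave the same 5'-GATC overhang, so Sau3A fragments ligate into a BamHI-cut vector. The sketch below, which assumes the enzyme named above is Sau3A, locates the sites in a linear top strand and reports the fragments a complete digest would give; the function names and the toy sequence are hypothetical. A partial digest such as the one described leaves most sites uncut and so yields much larger fragments.

```python
SITE = "GATC"   # Sau3A recognition sequence; the cut falls just 5' of the G


def cut_positions(sequence):
    """0-based indices at which a linear top strand is cut (before each GATC)."""
    positions = []
    start = sequence.find(SITE)
    while start != -1:
        positions.append(start)
        start = sequence.find(SITE, start + 1)
    return positions


def complete_digest(sequence):
    """Fragment lengths if every site in a linear molecule were cut."""
    bounds = [0] + cut_positions(sequence) + [len(sequence)]
    return [b - a for a, b in zip(bounds, bounds[1:]) if b > a]


if __name__ == "__main__":
    toy = "AAGATCTTTTGGATCCAA"   # hypothetical; the second site lies inside GGATCC
    print(cut_positions(toy))
    print(complete_digest(toy))
```

In random sequence a four-base site is expected about once every 4^4 = 256 bp, which is why limiting digestion conditions are needed to obtain fragments of 3-8 kb.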

White colonies (i.e., those containing plasmids having an insert) were picked 
onto 30 master plates (96 colonies/plate; a total of about 3,000 colonies were 
screened). Replica plates were generated to provide cells for DNA polymerase 
analysis. The library was screened by functional assay; the cells from each replica 
plate were removed and pooled by rinsing the plate with 2 ml of 0.9% NaCl. The 
cells were then collected by centrifugation (12,000 rpm for 3 min) in a microcentrifuge 
(Eppendorf). The pellets were then washed with 1.5 ml of 0.9% NaCl. The washed 
cell pellets were then lysed by sonication in 0.5 ml of a solution containing 50 mM 
Tris-HCl (pH 7.4), 1 mM EDTA and 5 mM PMSF. The lysates were then heated to 
72°C-75°C in a water bath for 20 min. Following the incubation, the lysates were 
clarified by centrifugation in a microfuge at 12,000 rpm for 10 min. The supernatant 
was removed to a fresh tube. DNA polymerase activity was assayed using the 
supernatant as follows. 

Denatured bovine thymus DNA was prepared as follows. A solution 
comprising 6 mM bovine thymus DNA (BioLAR, Olaine, Latvia; equivalent 
preparations of calf thymus DNA are available from Sigma, St. Louis, MO) in 1 mM 
NaOH was incubated for 15 min at 20°C. The solution was then neutralized by the 
addition of HCl to a final concentration of 100 mM and Tris-HCl, pH 8.0 to a final 
concentration of 50 mM. 

The following components were mixed: 2.5 µl 0.5 M Tris-HCl (pH 7.6), 5 µl 
100 mM MgCl2, 2 µl denatured bovine thymus DNA (2 mg/ml), 0.2 µl 12.5 mM of 
each of the dNTPs and 1.5 µCi of α-32P-dTTP and H2O to a volume of 25 µl. 
Twenty-five microliters of supernatant from each of the pools of lysed cells were 
mixed with 25 µl of the above assay mixture in the well of a 96 well microtiter plate. 
The mixture was incubated for 1 hour at 75°C. The reaction was stopped by the 
addition of 5 µl of 200 mM EDTA (pH 8.0). Five microliters of the reaction mixture 
was then loaded onto a 1 X 1 inch square of DEAE paper (Whatman). The samples 
were dried at 80°C and then washed with 0.5 M sodium phosphate (pH 7.2) (wash 
solution) using about 5 ml of wash solution per sample for 10 min with light shaking. 
Three washes were performed. The samples were then rinsed with water (10 
ml/sample) for 1 min followed by an ethanol rinse. The ethanol rinsed samples were 
then dried at 80°C and DEAE-absorbent radioactivity was counted using a liquid 
scintillation counter. The results of the initial DNA polymerase assays revealed that a 
single pool produced DNA polymerase activity. 
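
The final concentrations contributed by the assay mixture to the 50 µl reaction follow from simple dilution. The sketch below works through that arithmetic in Python. It counts only the components of the assay mixture listed above; whatever Tris, EDTA or PMSF the heat-treated supernatant carries over from the lysis buffer is deliberately ignored, and the names ASSAY_MIX and final_concentrations are illustrative only.

```python
# Stock volume (µl) and stock concentration for each assay-mix component,
# taken from the recipe in the text. Units are given in the key.
ASSAY_MIX = {
    "Tris-HCl (mM)": (2.5, 500.0),
    "MgCl2 (mM)": (5.0, 100.0),
    "each dNTP (mM)": (0.2, 12.5),
    "denatured DNA (mg/ml)": (2.0, 2.0),
}

# 25 µl of assay mixture plus 25 µl of heat-treated supernatant.
REACTION_VOLUME_UL = 50.0


def final_concentrations(mix, total_volume_ul):
    """Dilute each stock into the total volume: C_final = V_stock * C_stock / V_total."""
    return {name: vol * conc / total_volume_ul for name, (vol, conc) in mix.items()}


final = final_concentrations(ASSAY_MIX, REACTION_VOLUME_UL)
for name, value in final.items():
    print(f"{name}: {value:g}")
```

Under these assumptions the assay mixture supplies 25 mM Tris-HCl, 10 mM MgCl2, 0.05 mM of each dNTP and 0.08 mg/ml denatured DNA to the reaction.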

To isolate clones containing Tne genomic DNA encoding the DNA polymerase 
activity, the colonies on the positive master plate were grown as smaller pools 
comprising either a single row or a single column of colonies. The smaller pools of 
colonies were grown, lysates were prepared and DNA polymerase activity was 
determined as described above. A single row and a single column contained DNA 
polymerase activity; the intersection of this row and column identified the single 
individual clone containing DNA encoding DNA polymerase activity. This single 
colony was grown and assayed for DNA polymerase activity to confirm the presence 
of thermostable DNA polymerase activity. This clone was called pTen. 
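
The row and column pooling described above can be sketched as a small deconvolution routine. The following Python example is a hedged illustration only: the function name, the two-fold threshold and the counts are hypothetical and are not data from this example. It shows why 20 pooled assays (8 rows plus 12 columns) suffice to locate one positive well on a 96-colony master plate.

```python
def locate_positive_clones(row_counts, column_counts, background, fold=2.0):
    """Return (row, column) wells whose row pool AND column pool exceed the threshold.

    row_counts / column_counts map a pool label to its measured radioactivity.
    The threshold (fold x background) is an assumed cutoff, not one from the text.
    """
    threshold = background * fold
    rows = [r for r, cpm in row_counts.items() if cpm > threshold]
    cols = [c for c, cpm in column_counts.items() if cpm > threshold]
    return [(r, c) for r in rows for c in cols]


# Illustrative numbers only: one hot row pool and one hot column pool.
row_counts = {row: 100 for row in "ABCDEFGH"}
column_counts = {col: 100 for col in range(1, 13)}
row_counts["C"] = 5000
column_counts[7] = 5200

hits = locate_positive_clones(row_counts, column_counts, background=100)
pools_assayed = len(row_counts) + len(column_counts)
print(hits, pools_assayed)
```

With a single positive clone the intersection is unambiguous. If several rows and columns were positive, every intersection would be a candidate and each would need to be assayed individually.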

DNA was prepared from the pTen clone using standard techniques of molecular 
biology; this clone was found to contain an insert of approximately 3.5 kb. Restriction 
enzyme digests were performed with a battery of enzymes to create a restriction map 
of the Tne genomic DNA insert. Subclones were generated from the positive clone 
and a series of nested deletions were generated using Exonuclease III and standard 
molecular biology techniques to facilitate DNA sequencing [Short Protocols in 
Molecular Biology, 2nd ed. (1992) Ausubel et al., Eds., John Wiley & Sons, New York, 
pp. 7-8 to 7-16 and 7-29 to 7-37]. The DNA sequence of the insert was determined 
using the Sanger dideoxy sequencing method and Sequenase® (USB). The DNA 
sequence of the coding region for the full-length Tne DNA polymerase gene is listed 
in SEQ ID NO:1. The deduced amino acid sequence of the Tne DNA polymerase is 
listed in SEQ ID NO:2. 

Analysis of the deduced amino acid sequence was performed using protein 
analysis software (DNAStar, Inc., Madison, WI). The open reading frame encodes a 
protein of 893 amino acids; the predicted molecular weight of the protein is 102,054 
(however, as shown in Example below, the full-length protein migrates with an 
apparent molecular weight of 97,000 on SDS-PAGE gels). The predicted isoelectric 
point is 6.19 and the charge at pH 7.0 is -7.56. 

The nucleotide and amino acid sequences of the Tne DNA polymerase were 
compared with the reported sequences for E. coli DNA polymerase I and the 
thermostable DNA polymerase from T. maritima. Figure 3 provides an alignment of 
the amino acid residues of these three polymerases. In Figure 3 the following 
abbreviations are used: Eco (E. coli DNA polymerase I); Tma (Tma DNA polymerase) 
and Tne (Tne DNA polymerase). Shading is used to indicate residues which differ 
from the amino acid sequence of Tne DNA polymerase. 

In E. coli DNA polymerase I, the 5' exonuclease domain comprises 
approximately residues 1-323; the 3' exonuclease domain comprises approximately 
residues 324-517 and the synthetic or polymerization domain comprises approximately 
residues 521-928. Alignment of the amino acid sequences of E. coli DNA polymerase 
I with the sequence of the Tne DNA polymerase molecule of the present invention 
reveals that the 5' exonuclease domain of Tne DNA polymerase comprises 
approximately residues 1-297; the 3' exonuclease domain comprises approximately 
residues 298-482 and the polymerization domain comprises approximately residues 
486-893. 
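
The approximate domain boundaries given above can be tabulated and their lengths compared. The Python sketch below is only a bookkeeping aid; the dictionary layout and function names are illustrative and the boundaries are the approximate ones stated in the text.

```python
# Approximate domain boundaries (first residue, last residue) as stated in the text.
DOMAINS = {
    "E. coli Pol I": {"5' exo": (1, 323), "3' exo": (324, 517), "pol": (521, 928)},
    "Tne": {"5' exo": (1, 297), "3' exo": (298, 482), "pol": (486, 893)},
}


def span_length(first, last):
    """Number of residues in an inclusive residue range."""
    return last - first + 1


lengths = {
    enzyme: {name: span_length(*span) for name, span in domains.items()}
    for enzyme, domains in DOMAINS.items()
}
for enzyme, by_domain in lengths.items():
    print(enzyme, by_domain)
```

With these boundaries the polymerization domain spans 408 residues in both enzymes, whereas the two exonuclease domains of the Tne enzyme are somewhat shorter than their E. coli counterparts (297 versus 323 and 185 versus 194 residues).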

Alignment of amino acid residues present in E. coli DNA polymerase I and Tne 
DNA polymerase shows that the two enzymes are 44% identical overall and 51% 
identical over the polymerase domain (residues 521-928 in E. coli and residues 
486-893 in Tne). The alignment was performed using the Lipman-Pearson algorithm 
as provided by DNASTAR, Inc. (Madison, WI); gaps were introduced into the two 
sequences to provide for maximum alignment. 
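
Once two sequences have been aligned with gaps, a percent identity figure can be computed by counting matching columns. The Python sketch below is a minimal illustration and is not the DNASTAR implementation. It assumes that columns containing a gap in either sequence are excluded from the denominator, which is only one of several conventions in use, and the short sequences are toy strings rather than polymerase sequence.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of gap-free alignment columns in which the two residues are identical."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # assumed convention: gapped columns are not scored
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Toy alignment: 6 gap-free columns, 4 of them identical.
identity = percent_identity("MKV-LTA", "MRVQLSA")
print(f"{identity:.1f}%")
```

A different denominator (for example, the length of the shorter sequence or the full alignment length) would give a somewhat different value for the same alignment.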

Comparison of the nucleotide and amino acid sequences of the Tne and Tma 
polymerases revealed that these two polymerases share 78% identity at the nucleotide 
level and 88% identity at the amino acid level. 

EXAMPLE 2 

Efficient Expression Of Tne DNA Polymerase In E. coli 

In order to express the Tne DNA polymerase in large amounts in host cells, the 
DNA sequences encoding the polymerase (i.e., the coding region) were removed from 
pTen (described in Example 1) and inserted into an expression vector. 

Restriction enzyme analysis revealed that the Tne polymerase gene was present 
in the opposite transcriptional orientation relative to the T7 promoter present on the 
pTZ19R vector used to generate pTen. In order to produce Tne polymerase, sequences 
encoding the Tne polymerase gene were removed from pTen by digestion with SmaI 
and XbaI and an approximately 3.5 kb SmaI/XbaI fragment was isolated by 
electrophoresis of the digestion products on an agarose gel followed by excision of the 
desired band. DNA was recovered from the agarose block using the Wizard™ PCR 
Preps DNA Purification System (Promega). Briefly, 0.5 to 1.0 ml of Wizard™ PCR 
Preps DNA Purification Resin was added to the agarose block and the mixture was 
incubated at 42°C for 5 minutes to melt the agarose. DNA was extracted using the 
protocol provided in the kit. 

The 3.5 kb SmaI/XbaI fragment was ligated into the pGEM®-3Zf(+) vector 
(Promega) which had been digested with SmaI and XbaI to generate pGTne. This 
construction placed the 5' end (i.e., the end encoding the N terminal portion) of the Tne 
polymerase gene downstream of the T7 promoter in the same transcriptional 
orientation. The ligation mixture was used to transform competent JM109(DE3) cells 
(Promega). Recombinant clones were isolated and confirmed by restriction digestion 
using standard recombinant molecular biology techniques [Sambrook et al., Molecular 
Cloning, A Laboratory Manual (1989)]. 

Expression of the Tne DNA polymerase from the pGTne construct was next 
examined. JM109(DE3) cells containing pGTne were grown at 37°C and induced with 
1 mM IPTG. Parallel cultures of TG1 cells containing pTen were grown and induced. 
After a few hours (i.e., 1-3) of growth in the presence of IPTG, the cells were collected 
by centrifugation and crude lysates were prepared as follows. A 1 ml aliquot of each 
of the cultures containing the Tne constructs was centrifuged in a microcentrifuge at 
14,000 x g for 3 min at room temperature to pellet the cells. The cells were then 
resuspended in 200 µl of a solution comprising 50 mM Tris-HCl (pH 8.0), 50 mM 
glucose, 1 mM EDTA and the cells were pelleted again. The cells were next 
resuspended in 50 µl of the previous buffer containing 4 mg/ml lysozyme and the 
mixture was incubated at room temperature for 10 minutes. Following the incubation, 
50 µl of a solution comprising 10 mM Tris-HCl (pH 8.0), 50 mM KCl, 1 mM EDTA, 
1 mM PMSF, 0.5% Tween-20, 0.5% Nonidet P40 was added and the mixture was 
incubated at 75°C for 10 minutes. The lysate was then clarified by centrifugation in a 
microcentrifuge at 14,000 x g for 5 minutes. Eighty microliters of the supernatant was 
removed and stored in a separate tube at 4°C. The crude lysates were analyzed for 
polymerase activity at 74°C as described in Example 5(b), below. 

The following results were obtained. The pTen construct gave polymerase 
activity levels at or about 2 fold higher than the background level for the assay. The 
pGTne construct gave activity levels of about 50 times background levels. While the 
expression of Tne polymerase seen using pGTne was much improved relative to the 
level seen using pTen, this expression level was not sufficient to produce large 
amounts of the enzyme. 

The above results suggested that the Tne polymerase promoter was 
non-functional in E. coli (a very low level of activity was present when pTen was used). 
Furthermore, the presence of the Tne polymerase promoter appeared to be detrimental 
to expression when transcription was initiated from the T7 promoter in the pGTne 
construct (perhaps due to transcriptional read-through interference). In order to 
remove the Tne polymerase gene promoter from the Tne polymerase coding region, the 
following experiments were conducted. 

The 5' end of the Tne polymerase gene was sequenced 
using the M13 forward primer in conjunction with the fmol® DNA Sequencing System 
(Promega); sequencing was conducted according to the manufacturer's instructions. 
The sequence analysis revealed that a unique BglI site was found 43 bp into the coding 
region (i.e., 43 bp following the A of the initiator ATG codon). To remove the coding 
region of the Tne polymerase gene from pGTne, pGTne was digested with BglI and 
XbaI and the approximately 3.0 kb BglI/XbaI fragment was isolated (as described 
above). The 3.0 kb BglI/XbaI fragment was ligated directly downstream of either the 
T7 promoter or the tac promoter present in pALTER®-Ex1 (Promega); 
pALTER®-Ex1 contains both the T7 and the tac promoters positioned in opposite 
transcriptional orientations relative to one another. These two ligations were 
performed as follows. 

To insert the 3.0 kb BglI/XbaI fragment downstream of the T7 promoter, a 43 
bp synthetic linker having an NcoI overhanging end at one end and a BglI overhanging 
end at the other end was ligated to the Tne polymerase coding region. This linker was 
formed by annealing of the following two oligonucleotides: JH64 which comprises 
5'-CATGGCGAGACTATTTCTCTTTGATGGCACAGCCCTGGCCTACA-3' (SEQ 
ID NO:3) and JH65 which comprises 
5'-AGGCCAGGGCTGTGCCATCAAAGAGAAATAGTCTCGC-3' (SEQ ID NO:4). This 
synthetic linker regenerates the native 
sequence of the Tne polymerase gene located upstream of the BglI site and allows 
insertion of the coding region into pALTER®-Ex1. pALTER®-Ex1 was digested with 
NcoI and XbaI and the coding region containing the synthetic linker was ligated to the 
digested vector to generate pATne2. 
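
The way the two linker oligonucleotides anneal can be checked computationally. The Python sketch below takes the JH64 and JH65 sequences as printed above, locates the reverse complement of JH65 within JH64 and reports the single-stranded ends that remain. The helper names are illustrative, and the check assumes the shorter strand pairs perfectly and completely with the longer one.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def anneal(top, bottom):
    """Return (5' overhang of top, duplex region, 3' overhang of top).

    Assumes the bottom strand is fully complementary to an internal stretch of top.
    """
    target = reverse_complement(bottom)
    start = top.find(target)
    if start < 0:
        raise ValueError("bottom strand is not fully complementary to the top strand")
    return top[:start], target, top[start + len(target):]


JH64 = "CATGGCGAGACTATTTCTCTTTGATGGCACAGCCCTGGCCTACA"
JH65 = "AGGCCAGGGCTGTGCCATCAAAGAGAAATAGTCTCGC"

five_prime, duplex, three_prime = anneal(JH64, JH65)
print(five_prime, len(duplex), three_prime)
```

For the sequences as printed, the annealed pair carries a 4-nucleotide 5' CATG extension, which is the overhang left by NcoI, and a 3-nucleotide 3' extension, which is the overhang geometry left by BglI. The duplex region begins with the G that completes the ATG initiator codon.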

To insert the 3.0 kb BglI/XbaI fragment downstream of the tac promoter, a 43 
bp synthetic linker having an NdeI overhanging end at one end and a BglI overhanging 
end at the other end was ligated to the Tne polymerase coding region. This linker was 
formed by annealing of the following two oligonucleotides: JH62 which comprises 
5'-AGGCCAGGGCTGTGCCATCAAAGAGAAATAGTCTCGCCA-3' (SEQ ID NO:5) 
and JH63 which comprises 
5'-TATGGCGAGACTATTTCTCTTTGTGGCACAGCCCTGGCCTACA-3' (SEQ ID NO:6). This 
synthetic linker regenerates the native 
sequence of the Tne polymerase gene located upstream of the BglI site and allows 
insertion of the coding region into pALTER®-Ex1. pALTER®-Ex1 was digested with 
NdeI and XbaI and the coding region containing the synthetic linker was ligated to the 
digested vector to generate pATne1. 

Competent E. coli cells were transformed with the above ligation mixtures 
corresponding to either pATne1 (JM109 cells; Promega) or pATne2 [JM109(DE3) 
cells; Promega]. Recombinant clones were isolated and confirmed by restriction digestion 
using standard recombinant molecular biology techniques. Cells harboring either pATne1 or 
pATne2 were grown and induced as described above. Crude lysates were prepared 
and DNA polymerase assays were performed (as described above). The results of 
these polymerase assays showed that both pATne1 and pATne2 gave significantly 
better yields of Tne polymerase than pGTne (at least 2-3 fold higher). 

EXAMPLE 3 

Construction Of Tne Deletion Mutants Lacking 5' To 3' Exonuclease Activity 

As noted above, the presence of 5' to 3' exonuclease activity in a thermostable 
DNA polymerase is undesirable for certain applications. To construct mutant Tne 
polymerases lacking 5' to 3' exonuclease activity, two deletion mutants of the Tne 
polymerase gene were generated. Both mutants contain deletions which remove 
sequences encoding a large portion of the 5' to 3' exonuclease domain located at the 
N terminus of the Tne polymerase molecule. 

a) Construction Of A Vector Encoding Deletion Mutant 
Tne M284 

The deletion mutant Tne M284 is a truncated form of the Tne polymerase 
which uses the naturally occurring methionine at amino acid position 284 in the 
full-length protein (SEQ ID NO:2) as the initiating methionine for translation 
initiation. Figure 4 provides a schematic representation of several modified Tne 
polymerases (constructed as described in Examples 3 and 4) along the map of the 
full-length Tne polymerase protein. The scale represents length in increments of 100 
amino acid residues. The full length Tne polymerase (SEQ ID NO:2) contains 893 
amino acids. The thick open boxes represent the presence of amino acid residues; thin 
lines between two regions of thick boxes indicate that amino acids were deleted 
between the two open boxes. Circles containing a single letter indicate the location of 
a mutated amino acid residue (the single letter code is used for the amino acid residues 
indicated). 

To generate a construct containing the Tne M284 mutant, pGTne was digested 
with BspHI (generates ends compatible with NcoI ends) and KpnI and a 1.05 kb 
BspHI/KpnI fragment (containing the 5' portion of the coding region) was isolated as 
described in Example 2. A second aliquot of pGTne was digested with KpnI and XbaI 
and a 1.3 kb fragment containing the 3' portion of the Tne polymerase coding region 
was isolated. pALTER-Ex1 was digested with NcoI and XbaI. A three-way ligation 
was performed using the digested pALTER-Ex1 vector, the 1.05 kb BspHI/KpnI fragment 
and the 1.3 kb KpnI/XbaI fragment. Competent JM109(DE3) cells were transformed 
with the ligation mixture and recombinant clones were isolated and confirmed by 
restriction digestion using standard recombinant molecular biology techniques. The resulting 
plasmid was called pM284. The nucleotide sequence of the region encoding the Tne 
M284 gene is listed in SEQ ID NO:7. The amino acid sequence of Tne M284 is listed 
in SEQ ID NO:8. 
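
Because the truncated enzymes keep the residue numbering of the full-length protein, it can help to convert between the two numbering schemes. The Python sketch below is a bookkeeping aid under the assumption that a truncation simply starts at the named residue and runs to residue 893; the function names are illustrative.

```python
FULL_LENGTH = 893  # residues in the full-length Tne polymerase (SEQ ID NO:2)


def truncated_length(start_residue, full_length=FULL_LENGTH):
    """Residues remaining when translation starts at start_residue."""
    return full_length - start_residue + 1


def position_in_truncation(full_length_position, start_residue):
    """1-based position within the truncated protein of a full-length residue number."""
    if full_length_position < start_residue:
        raise ValueError("residue was removed by the truncation")
    return full_length_position - start_residue + 1


m284_length = truncated_length(284)
asp323_in_m284 = position_in_truncation(323, 284)
print(m284_length, asp323_in_m284)
```

On this basis Tne M284 lacks the first 283 residues and retains 610, and residue 323 of the full-length protein (one of the positions altered in Example 4) is the 40th residue of Tne M284.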
b) Construction Of A Vector Encoding Deletion Mutant 
Tne M316 

The deletion mutant Tne M316 is a truncated form of the Tne polymerase 
protein which uses an artificially created methionine at amino acid position 316 as the 
initiator methionine. Tne M316 was created by introducing a methionine residue at 
position 316 (and a corresponding NcoI site) via site-directed mutagenesis using the 
Altered Sites® II in vitro Mutagenesis System (Promega) in conjunction with 
mutagenesis oligonucleotide JH68 
[5'-ATCGAAAAGCTGACCATGGTTCCATCTTTTG-3' (SEQ ID NO:9)] and pATne2. The 
manufacturer's protocol was followed 
exactly. Briefly, pATne2 was denatured using alkali and the JH68 mutagenic 
oligonucleotide was annealed to the denatured plasmid along with the ampicillin repair 
oligonucleotide (provided in the kit). The mutant strand was synthesized using T4 
DNA polymerase and T4 DNA ligase. ES1301 mutS cells (provided in the kit) were 
then cotransformed with the mutagenized pATne2 and R408 DNA (provided in the 
kit). Small scale DNA preparations were prepared from the transformed ES1301 mutS 
cells and the DNA was used to transform JM109 cells. Mutants were selected by 
growth on ampicillin plates and the desired recombinants were confirmed by restriction 
enzyme analysis (i.e., presence of an additional NcoI site). The resulting plasmid 
containing the engineered NcoI site was then digested with NcoI, which deleted the 5' 
to 3' exonuclease domain as a 948 bp fragment, and the large fragment was isolated 
(as described above) and religated to itself to create the pM316 construct. The 
nucleotide sequence of the region encoding the Tne M316 gene is listed in SEQ ID 
NO:10. The amino acid sequence of Tne M316 is listed in SEQ ID NO:11. 
c) Expression Of Tne M284 And Tne M316 In E. coli 

The pM284 and pM316 constructs (in JM109 cells) were grown, induced and 
crude lysates were prepared as described in Example 2. DNA polymerase activity was 
measured in crude lysates as described in Example 2. 

The results of the polymerase assays showed that the Tne M284 mutant 
(pM284) contained 17 units per µl of polymerase activity from the crude lysate and 
the Tne M316 mutant (pM316) produced no detectable polymerase activity. No 
detectable polymerase activity was found when the Tne M316 polymerase was 
expressed from the trc promoter either (to express the Tne M316 coding region from 
the trc promoter, a 2.5 kb NcoI/PstI fragment was isolated from pM316 and ligated to 
pTrc 99 A (Pharmacia) digested with NcoI and PstI). 

Aliquots (5 µl) of each crude lysate were electrophoresed on a pre-cast 4-20% 
denaturing gradient polyacrylamide gel (Novex, San Diego, CA); following 
electrophoresis, the gel was stained with Coomassie blue to visualize the separated 
proteins. A single, sharp protein band corresponding to the expected size was visible 
in lysates produced from cells containing the pATne1 (full-length Tne polymerase) and 
pM284 (Tne M284 deletion mutant) constructs. No protein band was observed for 
deletion mutant Tne M316 when expressed from either the T7 or trc promoters. 

In order to increase the level of expression of Tne M284 protein in E. coli, the 
Tne M284 coding region was placed downstream of the strong tac promoter present in 
the JHEX3 vector to create pJM284. JHEX3 was created as follows. pALTER-1 
(Promega) was digested with ClaI and StyI and the ends were made blunt by 
incubation with the Klenow fragment. The 1.345 kb ClaI/StyI fragment was isolated 
and ligated into pTrc 99 A (Pharmacia) which had been digested with BsaAI. This 
ligation inserted the tetracycline-resistance gene into the pTrc 99 A vector; the 
resulting vector was called JHEXa. The ampicillin-resistance gene was then removed 
from the JHEXa by digestion with SspI, DraI and PvuI; this digestion cut the 
ampicillin gene into four small fragments (483 bp, 227 bp, 209 bp and 19 bp). The 
large fragments (3.93 kb and 652 bp) were isolated and ligated together to create 
JHEXb. The trc promoter was removed from JHEXb as an 89 bp SspI/NcoI fragment 
and replaced with the tac promoter. The tac promoter was inserted into the 
SspI/NcoI-digested JHEXb vector as a 141 bp BsrBI fragment from pALTER-Ex1 
(Promega) together with a 30 bp linker formed by the oligonucleotide pair listed in 
SEQ ID NOS:45 and 46. 

The pJM284 construct was made as follows. pGTne was digested with 
BspHI (generates ends compatible with NcoI ends) and KpnI and a 1.05 kb 
BspHI/KpnI fragment (containing the 5' portion of the coding region) was isolated as 
described in Example 2. A second aliquot of pGTne was digested with KpnI and XbaI 
and a 1.3 kb fragment containing the 3' portion of the Tne polymerase coding region 
was isolated. JHEX3 was digested with NcoI and XbaI. A three-way ligation was 
performed using the digested JHEX3 vector, the 1.05 kb BspHI/KpnI fragment and the 1.3 
kb KpnI/XbaI fragment. Competent JM109 cells were transformed with the ligation 
mixture and recombinant clones were isolated and confirmed by restriction digestion using 
standard recombinant molecular biology techniques. The resulting plasmid was called pJM284. 

Crude lysates were prepared from a small scale culture of JM109 cells 
containing the pJM284 construct or the pM284 construct. DNA polymerase assays 
were performed as described in Example 2. The level of Tne M284 polymerase 
produced by pJM284 was found to be about 50% greater than the level produced by 
expression from the pM284 construct. 
EXAMPLE 4 

Construction Of Tne Polymerase Mutants 
Having Altered 3' To 5' Exonuclease Activity 



In order to produce modified forms of Tne polymerase which possess varying 
amounts of 3' to 5' exonuclease activity, seven different point mutants and two 
deletion mutants were created using the pM284 construct as the starting material. 
Figure 4 provides a schematic drawing of these mutant Tne polymerases. 

All nine mutagenic changes also involved a change in the restriction digest 
pattern of the starting pM284 plasmid to allow for easy selection of the mutants. In 
all cases, a small portion of the mutagenized region was exchanged into a Tne M284 
gene that did not undergo mutagenesis and the exchanged region was sequenced not 
only to confirm the mutation, but also to show that there were no second site 
mutations. DNA sequencing was performed using the fmol® DNA Sequencing 
System (Promega) in conjunction with primers JH61 
[5'-TGCCGTACACCTCCGAGAGC-3' (SEQ ID NO:12)] or JH66 
[5'-CTCGTTTGGCTCCAGCAAATATGC-3' (SEQ ID NO:13)]. The mutants were 
constructed as follows. 

a) Construction Of pD323E 

pD323E produces a modified form of the Tne polymerase which lacks the first 
283 amino acids from the N-terminus of the full-length protein and contains an amino 
acid substitution at residue 323 (number indicates position of the residue in the full 
length protein). At amino acid residue 323 the wild-type aspartic acid is replaced with 
glutamic acid. pM284 was used in conjunction with the mutagenic oligonucleotide 
JH74 [5'-TTTGCCCTGGAaCTTGAAACG-3' (SEQ ID NO:14); the mutagenic 
residue is indicated by the lower case letter] and the Altered Sites® II in vitro 
Mutagenesis System (Promega) to generate pD323E as described in Example 3. The 
desired mutants were confirmed by restriction analysis (absence of one of the SinI 
restriction sites present in pM284). The DNA sequence of pD323E was obtained as 
described above using the JH66 (SEQ ID NO:13) primer. The DNA sequence of the 
polymerase coding region present in pD323E is listed in SEQ ID NO:15. The 
corresponding amino acid sequence of the Tne M284(D323E) protein is listed in SEQ 
ID NO:16. 

b) Construction Of pE325D 

pE325D produces a modified form of the Tne polymerase which lacks the first 
283 amino acids from the N-terminus of the full-length protein and contains an amino 
acid substitution at residue 325. At amino acid residue 325, the wild-type glutamic 
acid residue is replaced with aspartic acid. pM284 was used in conjunction with the 
mutagenesis oligonucleotide JH75 [5'-GACCTTGAcACGTCCTC-3' (SEQ ID NO:17); the 
mutagenic residue is indicated by the lower case letter] and the Altered Sites® II in 
vitro Mutagenesis System (Promega) to generate pE325D as described in Example 3. 
The desired mutant was confirmed by restriction analysis (the presence of an additional 
AflIII restriction site). The DNA sequence of pE325D was obtained as described above 
using the JH66 (SEQ ID NO:13) primer. The DNA sequence of the polymerase 
coding region present in pE325D is listed in SEQ ID NO:18. The corresponding 
amino acid sequence of the Tne M284(E325D) protein is listed in SEQ ID NO:19. 

c) Construction Of pY464F 

pY464F produces a modified form of the Tne polymerase which lacks the first 
283 amino acids from the N-terminus of the full-length protein and contains an amino 
acid substitution at residue 464. At amino acid residue 464, the wild-type tyrosine 
residue is replaced with phenylalanine. pY464F was constructed by replacing a 39 bp 
FokI fragment present in pM284 with a 39 bp synthetic region. The 39 bp synthetic 
region was formed by the following two oligonucleotides: JH81 
[5'-TAAGTGATATCTGCATCCTCGCAGGAGAAGTTCGCAGCC-3' (SEQ ID NO:20)] and JH82 
[5'-ACAAGGCTGCGAACTTCTCCTGCGAGGATGCAGATATCA-3' (SEQ ID 
NO:21)]. This synthetic 39 bp oligonucleotide contains the mutation. The desired 
mutant was confirmed by restriction analysis (the presence of an additional EcoRV 
restriction site). The DNA sequence of pY464F was obtained as described above 
using the JH61 (SEQ ID NO:12) primer. The DNA sequence of the polymerase 
coding region present in pY464F is listed in SEQ ID NO:22. The corresponding 
amino acid sequence of the Tne M284(Y464F) protein is listed in SEQ ID NO:23. 
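
Several of the mutants above are screened by the gain of a restriction site carried on the mutagenic oligonucleotide. The Python sketch below checks three of the oligonucleotides given so far for the site named in the text. The recognition sequences (NcoI CCATGG, AflIII ACRYGT, EcoRV GATATC) are the commonly cited ones, the helper names are illustrative, and only the strand as written is searched, which is sufficient here because all three sites are palindromic.

```python
import re

# IUPAC nucleotide codes needed for the recognition sequences used below.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "[AG]", "Y": "[CT]", "W": "[AT]", "N": "[ACGT]"}

SITES = {"NcoI": "CCATGG", "AflIII": "ACRYGT", "EcoRV": "GATATC"}


def has_site(seq, enzyme):
    """True if the recognition sequence of enzyme occurs in seq (case-insensitive)."""
    pattern = "".join(IUPAC[base] for base in SITES[enzyme])
    return re.search(pattern, seq.upper()) is not None


# Oligonucleotide sequences as given in the text, with the site each is said to introduce.
OLIGOS = {
    "JH68": ("ATCGAAAAGCTGACCATGGTTCCATCTTTTG", "NcoI"),
    "JH75": ("GACCTTGAcACGTCCTC", "AflIII"),
    "JH81": ("TAAGTGATATCTGCATCCTCGCAGGAGAAGTTCGCAGCC", "EcoRV"),
}

results = {name: has_site(seq, enzyme) for name, (seq, enzyme) in OLIGOS.items()}
print(results)
```

Each oligonucleotide contains the site used as its diagnostic marker, so digestion of candidate plasmids with the corresponding enzyme would be expected to yield one more fragment than the parent plasmid.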

d) Construction Of pD468N 

pD468N produces a modified form of the Tne polymerase which lacks the first 
283 amino acids from the N-terminus of the full-length protein and contains an amino 
acid substitution at residue 468. At amino acid residue 468, the wild-type aspartic acid 
residue is replaced with asparagine. pM284 was used in conjunction with the 
mutagenesis oligonucleotide JH79 
[5'-ACTCCTGCGAGaATGCtGACATCACTTATAGG-3' (SEQ ID NO:24); the mutagenic 
residues are indicated by the use of lower 
case letters] and the Altered Sites® II in vitro Mutagenesis System (Promega) to 
generate pD468N as described in Example 3. The desired mutant was confirmed by 
restriction analysis (the presence of an additional BsmI restriction site). The DNA 
sequence of pD468N was obtained as described above using the JH61 (SEQ ID 
NO:12) primer. The DNA sequence of the polymerase coding region present in 
pD468N is listed in SEQ ID NO:25. The corresponding amino acid sequence of the 
Tne M284(D468N) protein is listed in SEQ ID NO:26. 
e) Construction Of pD323A 

pD323A produces a modified form of the Tne polymerase which lacks the first 
283 amino acids from the N-terminus of the full-length protein and contains an amino 
acid substitution at residue 323. At amino acid residue 323, the wild-type aspartic acid 
residue is replaced with alanine. pM284 was used in conjunction with the mutagenesis 
oligonucleotide JH70 [5'-TTTGCCCTGGcCCTTGAAACG-3' (SEQ ID NO:27); the 
mutagenic residue is indicated by the use of the lower case letter] and the Altered 
Sites® II in vitro Mutagenesis System (Promega) to generate pD323A as described in 
Example 3. The desired mutant was confirmed by restriction analysis (the absence of 
a SinI restriction site). The DNA sequence of pD323A was obtained as described 
above using the JH66 (SEQ ID NO:13) primer. The DNA sequence of the polymerase 
coding region present in pD323A is listed in SEQ ID NO:28. The corresponding 
amino acid sequence of the Tne M284(D323A) protein is listed in SEQ ID NO:29. 
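
The effect of the two residue-323 oligonucleotides (JH74 in section a and JH70 in this section) can be followed by translating them. In the Python sketch below the reading frame (codons starting at the first base of each oligonucleotide) and the wild-type sequence are inferences, not statements from the text: the wild-type stretch is reconstructed by combining the unmutated bases of JH74 and JH70, and the frame is the one that places Asp at the mutated codon. SinI is taken to recognize GGWCC.

```python
from itertools import product
import re

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

SINI = re.compile("GG[AT]CC")  # SinI recognition sequence GGWCC


def translate(seq):
    """Translate complete codons of seq, reading from its first base."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


WILD_TYPE = "TTTGCCCTGGACCTTGAAACG"  # inferred, see lead-in
JH74 = "TTTGCCCTGGAaCTTGAAACG"       # D323E oligonucleotide
JH70 = "TTTGCCCTGGcCCTTGAAACG"       # D323A oligonucleotide

summary = {
    name: (translate(seq), SINI.search(seq.upper()) is not None)
    for name, seq in (("wild type", WILD_TYPE), ("JH74", JH74), ("JH70", JH70))
}
print(summary)
# wild type -> ('FALDLET', True); JH74 -> ('FALELET', False); JH70 -> ('FALALET', False)
```

Under these assumptions the fourth codon changes from Asp (GAC) to Glu (GAA) or Ala (GCC), and either change destroys the GGACC SinI site that overlaps this codon, consistent with the loss of a SinI site used to screen both mutants.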

f) Construction Of pD389A 

pD389A produces a modified form of the Tne polymerase which lacks the first 
283 amino acids from the N-terminus of the full-length protein and contains an amino 
acid substitution at residue 389. At amino acid residue 389, the wild-type aspartic acid 
residue is replaced with alanine. 

To construct pD389A, the PCR was used to amplify two overlapping fragments 
independently; the PCR products were combined and the resulting large fragment was 
reamplified. Mutagenesis oligonucleotide JH80 
[5'-CCTGAAGTACGcgTACAAGGTTCTTATGG-3' (SEQ ID NO:30); the mutagenic 
residues are indicated by the use of 
lower case letters] and sequencing primer JH61 (SEQ ID NO:12) were used to prime a 
first PCR to create a 425 bp fragment which incorporates the desired mutation. The 
sequencing primers JH66 (SEQ ID NO:13) and M13 reverse (Promega Q5401; SEQ 
ID NO:31) were used to amplify a 564 bp fragment using pM284 as the template in a 
second PCR. When these two PCR products were combined using the M13 reverse 
and JH61 primers, an 889 bp fragment was made. A 348 bp BglII fragment was then 
removed from the 889 bp product and was exchanged with the analogous, but 
non-mutagenized, BglII fragment of pM284. The desired mutant was confirmed by the 
presence of an extra MluI restriction site. All PCR reactions were performed using a 
Perkin-Elmer 480 thermal cycler. 
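
The fragment sizes quoted above fix the length of the region shared by the two primary PCR products. The short Python sketch below performs that arithmetic; it assumes the fused product is exactly the union of the two fragments with no added or lost bases, and the function name is illustrative.

```python
def overlap_length(left_len, right_len, fused_len):
    """Length shared by two fragments that fuse into a product of fused_len."""
    overlap = left_len + right_len - fused_len
    if overlap <= 0:
        raise ValueError("fragments do not overlap")
    return overlap


overlap = overlap_length(425, 564, 889)
print(overlap)  # 100
```

A shared region of about 100 bp is what allows the two products to prime one another in the primer-free cycles of the fusion reaction.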

For generation of the 564 bp product, the PCR was conducted by performing 
15 cycles comprising a denaturation step (95°C for 15 sec) and an annealing/extension 
step (70°C for 1 min). JM284 was used as the template in a reaction containing 1 µM 
of each of the primers (JH66 and M13 reverse), 1.5 mM MgCl2 and 3 units Tli DNA 
polymerase (Promega). 

For generation of the 425 bp product, the PCR was conducted by performing 
20 cycles comprising a denaturation step (95°C for 15 sec), an annealing step (55°C 
for 30 sec; cycles 1-5) or an annealing step (70°C for 15 sec; cycles 6-20) and an 
extension step (70°C for 45 sec). JM284 was used as the template in a reaction 
containing 1 µM of each of the primers (JH80 and JH61), 1.5 mM MgCl2 and 3 units 
Tli DNA polymerase (Promega). 

For the generation of the 889 bp product, the 564 bp product and the 425 bp 
product were used as the template in a reaction containing 1.5 mM MgCl2 and 3 units 
Tli DNA polymerase (Promega). The cycling conditions were: denaturation (95°C for 
15 sec) and annealing/extension at 70°C for 1 min; no primers were present for cycles 
1-5. One µM of the M13 reverse primer was present for cycles 5-10 and 1 µM of the 
M13 reverse and JH61 primers were present in cycles 11-25. 
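
The cycling conditions for the two primary amplifications can be written down as data and the programmed hold time totalled. The Python sketch below is a convenience only: the Stage class and program names are hypothetical, and ramp times between temperatures, which depend on the instrument, are not included.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    """A block of identical cycles; each step is (temperature in °C, seconds)."""
    cycles: int
    steps: tuple


def hold_seconds(program):
    """Total programmed hold time, ignoring ramping between temperatures."""
    return sum(stage.cycles * sum(sec for _, sec in stage.steps) for stage in program)


# 564 bp product: 15 two-step cycles.
PROGRAM_564 = (Stage(15, ((95, 15), (70, 60))),)

# 425 bp product: cycles 1-5 anneal at 55°C, cycles 6-20 anneal at 70°C.
PROGRAM_425 = (
    Stage(5, ((95, 15), (55, 30), (70, 45))),
    Stage(15, ((95, 15), (70, 15), (70, 45))),
)

print(hold_seconds(PROGRAM_564), hold_seconds(PROGRAM_425))  # 1125 1575
```

The programmed hold times therefore come to roughly 19 minutes for the 564 bp reaction and 26 minutes for the 425 bp reaction, before ramping is taken into account.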

The DNA sequence of pD389A was obtained as described above using the 
JH61 and JH66 (SEQ ID NOS:12 and 13) primers. The DNA sequence of the 
polymerase coding region present in pD389A is listed in SEQ ID NO:32. The 
corresponding amino acid sequence of the Tne M284(D389A) protein is listed in SEQ 
ID NO:33. 



g) Construction Of pD323,389A 

pD323,389A produces a modified form of the Tne polymerase which lacks the 
first 283 amino acids from the N-terminus of the full-length protein and contains two 
amino acid substitutions at residues 323 and 389. At amino acid residue 323, the 
wild-type aspartic acid residue is replaced with alanine and at amino acid residue 389, 
the wild-type aspartic acid residue is replaced with alanine. 

To create pD323,389A, the 215 bp Csp45I fragment of pD323A was exchanged 
with the analogous fragment in pD389A, bringing the two single mutations into the 
same construct. The desired mutants were selected as having the two restriction site 
changes of the individual mutants (described above). The DNA sequence of 
pD323,389A was obtained as described above using the JH66 (SEQ ID NO:13) primer. 
The DNA sequence of the polymerase coding region present in pD323,389A is listed 
in SEQ ID NO:34. The corresponding amino acid sequence of the Tne 
M284(D323A,D389A) protein is listed in SEQ ID NO:35; this enzyme is referred to as 
the triple mutant Tne polymerase. 

When all of the above point mutant constructs (sections a-g) were induced to 
express the modified Tne polymerases in an E. coli host, the crude extracts showed 
polymerase activity comparable with the activity observed using the original pM284 
construct (cultures were grown, induced, lysates prepared and assayed for DNA 
polymerase activity as described in Example 2). 

h) Construction Of Deletion Mutants pM323 And pJM284AB 

Two mutants were created which contained deletions into the putative 3' to 5' 
exonuclease domain of the Tne polymerase gene. It was predicted that these two 
deletion mutations would remove all 3' to 5' exonuclease activity. Surprisingly, when 
these two mutants were expressed in E. coli, no polymerase activity was detected. 

i) Construction Of pM323 

pM323 was constructed by digestion of pJM284 with SinI and EcoRV followed 
by removal of the 3' overhanging (i.e., sticky) ends with Mung Bean nuclease. The 
1.9 kb SinI/EcoRV (polished ends) fragment was isolated as described in Example 2. 
JHEX3 (Example 3) was digested with NcoI and SmaI and the NcoI overhanging ends 
were made blunt by incubation with the Klenow fragment. The 1.9 kb 
SinI/EcoRV (polished ends) fragment was then blunt end ligated into the prepared 
JHEX3 vector. When the blunted SinI end ligates to the blunted NcoI end, 39 amino 
acids are removed from the N terminus of the protein encoded by the pM284 
construct; this junction also creates an initiating methionine at amino acid position 323 
that is in-frame with the rest of the coding region. The polymerase coding region 
present in the pM323 construct was sequenced to confirm that no undesirable 
mutations were introduced (i.e., insertion of stop codons, frame-shift mutations). The 
DNA sequence of the polymerase coding region present in pM323 is listed in SEQ ID 
NO:36. The corresponding amino acid sequence of the Tne M323 protein is listed in 
SEQ ID NO:37. 

ii) Construction Of pJM284AB 

pJM284AB was constructed by digestion of the JM284 construct with BglII 
followed by religation of the large fragment (6.6 kb) back on itself. Digestion of 
JM284 with BglII created two fragments, the smaller being 348 bp which contains 
DNA sequences which encode a portion of the 3' to 5' exonuclease domain. The 
pJM284AB construct removed 116 amino acids of the 3' to 5' exonuclease domain 
which correspond to residues 309 through 424 in SEQ ID NO:2. pJM284AB retains 
the same 25 N-terminal amino acids as the protein encoded by 
pM284. This deletion does not change the reading frame for the polymerase domain. 
Restriction digest analysis was conducted on the pJM284AB construct to confirm the 
proper construction was made. The DNA sequence of the polymerase coding region 
present in pJM284AB is listed in SEQ ID NO:38. The corresponding amino acid 
sequence of the Tne M284AB protein is listed in SEQ ID NO:39. 
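
The figures quoted for pJM284AB are mutually consistent, which can be checked with a few lines of arithmetic. The following minimal sketch (the constant and function names are illustrative and do not come from the source) confirms that residues 309 through 424 span 116 amino acids, that 116 codons account for the 348 bp fragment, and that a 348 bp deletion leaves the downstream reading frame unchanged. It assumes that the protein encoded by pM284 begins at residue 284 of SEQ ID NO:2, as implied by the removal of the first 283 amino acids.

```python
# Residue numbering follows SEQ ID NO:2 as quoted in the text.
FIRST_DELETED = 309   # first residue removed in pJM284AB
LAST_DELETED = 424    # last residue removed in pJM284AB
FRAGMENT_BP = 348     # size of the excised restriction fragment
M284_START = 284      # assumed first residue of the pM284-encoded protein


def residues_in_range(first: int, last: int) -> int:
    """Number of residues in the inclusive range first..last."""
    return last - first + 1


def preserves_reading_frame(deleted_bp: int) -> bool:
    """A deletion keeps the downstream frame only if it is a multiple of 3 bp."""
    return deleted_bp % 3 == 0


n_deleted = residues_in_range(FIRST_DELETED, LAST_DELETED)
n_retained_upstream = FIRST_DELETED - M284_START

print(n_deleted, n_deleted * 3, preserves_reading_frame(FRAGMENT_BP), n_retained_upstream)
```

The 25 residues retained upstream of the deletion correspond to residues 284 through 308.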

iii) Expression Of pM323 And pJM284AB In E. coli 

When both of these deletion mutant constructs were grown and induced to 
express the Tne polymerase in an E. coli host (JM109), no polymerase 
activity was detected. As these two constructs were analyzed by restriction digestion 
or DNA sequencing to ensure that no undesirable mutations were introduced, it 
appears, surprisingly, that deletion into the putative 3' exonuclease domain 
(approximately aa residues 291-484) is deleterious either for polymerase activity or 
alternatively for protein stability. These results, in conjunction with those obtained 
using the pM316 construct, show that deletions beyond about position 849 of SEQ ID 
NO:1 produce proteins which either are unstable (perhaps due to improper folding) or 
lack polymerase activity. 

EXAMPLE 5 

Purification Of Tne DNA Polymerases 



In order to produce purified preparations of the wild-type and modified Tne 
polymerases, cells harboring the Tne expression vectors described above were grown, 
induced and the Tne polymerases were isolated. 

a) Growth Of E. coli Cells Harboring Recombinant Tne 
(rTne) Constructs 

E. coli strains containing a desired construct were streaked onto LB plates [10 
g/l tryptone, 5 g/l yeast extract, 5 g/l NaCl, 1 ml/l 1 N NaOH and 15 g/l agar] 
containing 10 μg/ml tetracycline to isolate single colonies and the plates were grown 
overnight at 37°C. A single colony was inoculated into 150 ml LB broth [10 g/l 
tryptone, 5 g/l yeast extract, 5 g/l NaCl and 1 ml/l 1 N NaOH] containing 10 μg/ml 
tetracycline (divided into three flasks of 50 ml each); the three flasks were grown with 
shaking overnight at 37°C. The next day, 120 ml of the overnight culture was used to 
seed the fermentation of 6 liters of LB containing 10 μg/ml tetracycline prewarmed to 
37°C (divided into six flasks of 1 liter each). The large scale culture was grown for 5 
hours at 37°C and then IPTG was added to a final concentration of 1 mM and growth 
was continued for an additional 2 hours at 37°C. The induced cells were harvested by 
centrifugation at 9,000 rpm for 5 minutes in a Beckman JA10 rotor. Yields were 
typically 2 g cell paste per liter of fermented culture. 

b) Purification Of rTne DNA Polymerases 

Ten grams of cell paste (prepared as described above) were resuspended in 100 
ml of an ice-cold solution containing 0.25 M NaCl in TEDGT buffer [50 mM 
Tris-HCl (pH 7.3), 1 mM EDTA, 1 mM DTT, 10% glycerol and 0.1% Tween 20] 
containing 2.5 mM PMSF. The resuspended cells were lysed by sonication using a 
Vibracell sonicator (Model VCX600; Sonics and Materials, Inc., Danbury, CT). The 
solution was kept ice-cold during sonication by placement of the beaker containing the 
cell suspension in a salted ice bath. Sonication was repeated ten times at 40% output 
for 1 minute with a 2 minute rest between the 1 minute sonication bursts. The cell 
lysate was heat treated to denature the bulk of E. coli proteins by incubation of the 
lysate at 68°C to 70°C for 5 to 10 minutes; following heat treatment the lysate was 
placed on ice. 

The following purification steps were performed at 4°C. The chilled lysate was 
centrifuged at 15,000 rpm for 15 minutes in a Beckman JA18 rotor to remove the 
heat-denatured proteins. The cleared lysate supernatant was removed and 4 ml of 5% 
polyethylenimine (PEI) was added to the supernatant to precipitate any DNA present in 
the lysate. The lysate was centrifuged at 15,000 rpm for 15 minutes in a Beckman 
JA18 rotor to remove the precipitated DNA. The supernatant was retrieved and solid 
ammonium sulfate was added to 60% saturation to precipitate the DNA polymerase. 
After dissolution of the ammonium sulfate, the sample was centrifuged at 15,000 rpm 
for 1 hour in a Beckman JA18 rotor. The supernatant was discarded and the 
precipitated proteins were gathered and dissolved in 10 ml TEDGT buffer. The 
resolubilized protein was then placed in a dialysis membrane tubing having a 12,000 to 
14,000 mw cutoff (Spectra, Houston, TX) and then dialyzed against TEDGT buffer to 
remove the ammonium sulfate. 

The dialyzed protein solution was then loaded onto a 15 ml Heparin Sepharose 
(Scientific Protein Laboratory, Waunakee, WI) column (1.7 cm i.d. x 6.5 cm height). 
The column was washed with 150 ml 0.05 M NaCl in TEDGT buffer. A 100 ml salt 
(NaCl) gradient was run over the column to elute the DNA polymerase; the gradient 
started at 0.05 M NaCl and ended at 1 M NaCl (all in TEDGT buffer). Fractions (1.5 
ml) were collected and assayed for DNA polymerase activity at 74°C as described 
below. 
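
To relate a fraction number to the salt concentration at which it eluted, the gradient described above can be modeled as a straight line. The following is a minimal sketch only: it assumes an ideal linear gradient with no mixing delay or column dead volume (so real fractions will elute somewhat later than calculated), and the function names are hypothetical.

```python
def gradient_concentration(volume_ml: float,
                           start_m: float = 0.05,
                           end_m: float = 1.0,
                           gradient_ml: float = 100.0) -> float:
    """NaCl molarity of an ideal linear gradient after volume_ml has been delivered."""
    if not 0.0 <= volume_ml <= gradient_ml:
        raise ValueError("volume lies outside the gradient")
    return start_m + (end_m - start_m) * volume_ml / gradient_ml


def fraction_midpoint_concentration(fraction_number: int,
                                    fraction_ml: float = 1.5) -> float:
    """NaCl molarity at the midpoint of a 1-based fraction number."""
    midpoint_ml = (fraction_number - 0.5) * fraction_ml
    return gradient_concentration(midpoint_ml)


# Example: nominal salt concentration at the middle of fraction 10.
print(f"{fraction_midpoint_concentration(10):.3f} M NaCl")
```

Measuring the conductivity of selected fractions, as is done later in the procedure for the dialyzed pools, would give a direct check on the calculated values.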

Thermostable DNA polymerase activity was assayed by incorporation of 
radiolabeled dTTP into nicked and gapped (i.e., activated) calf thymus DNA (prepared 
as described below). One unit of thermostable DNA polymerase is defined as the 
amount of enzyme required to catalyze the incorporation of 10 nmol of dNTP into an 
acid-insoluble form in 30 minutes at 74°C. The reaction conditions comprised: 50 
mM Tris-HCl (pH 9.0 at 25°C), 50 mM NaCl, 10 mM MgCl2, 200 μM dATP, 200 μM 
dCTP, 200 μM dGTP, 200 μM dTTP, 5 μCi 3H-dTTP (Amersham) and 60 μg 
activated calf thymus DNA in a 250 μl final volume. 

The reaction components were assembled at room temperature. Samples 
suspected of containing polymerase activity were added (5 μl containing 0.05 to 0.5 
units) and the tube was incubated at 74°C; aliquots (50 μl) were withdrawn at 6, 9, 12 
and 15 minutes and placed immediately into 1.5 ml microcentrifuge tubes containing 
0.5 ml of ice-cold 10% TCA on ice. After 10-30 minutes on ice, the entire TCA 
precipitation was filtered through a GF/A filter (Whatman). The reaction tubes were 
rinsed with 3 volumes of cold 5% TCA and the filters were washed twice with 10 ml 
of ice-cold 5% TCA followed by a rinse with 1 ml of acetone. The filters were dried 
and the radioactivity bound to filters was counted in a scintillation counter. 

Total and background counts were determined as follows. For total cpm, 10 μl 
of the reaction mix (without any polymerase added) was spotted onto duplicate GF/A 
filters and counted. For background counts, 50 μl of the reaction mix (without any 
polymerase added) was added to duplicate tubes containing 0.5 ml cold 10% TCA and 
the samples were filtered through GF/A filters and washed as described above. 
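
The unit definition and the total and background counts can be combined into a calculation of units per microliter of sample. The following is a minimal sketch, not part of the original protocol. It assumes that the final reaction volume of 250 microliters includes the 5 microliter sample, that incorporation is linear up to the time point used, that labeled dTTP is incorporated in proportion to total dNTP, and that spotted and TCA-precipitated filters count with equal efficiency. The function name and its parameters are hypothetical.

```python
def polymerase_units_per_ul(sample_cpm: float,
                            background_cpm: float,
                            total_cpm: float,
                            minutes: float,
                            dntp_each_um: float = 200.0,
                            n_dntps: int = 4,
                            reaction_ul: float = 250.0,
                            total_spot_ul: float = 10.0,
                            aliquot_ul: float = 50.0,
                            enzyme_ul: float = 5.0) -> float:
    """Units of polymerase per microliter of the sample added to the assay.

    One unit incorporates 10 nmol of dNTP into acid-insoluble form in 30 minutes.
    """
    # 1 micromolar equals 1 pmol per microliter; divide by 1000 for nmol per microliter.
    dntp_nmol_per_ul = dntp_each_um * n_dntps / 1000.0
    # Specific radioactivity of the mix, from the spotted "total cpm" filter.
    cpm_per_nmol = total_cpm / (dntp_nmol_per_ul * total_spot_ul)
    nmol_in_aliquot = (sample_cpm - background_cpm) / cpm_per_nmol
    # Scale the observed time point to the 30 minutes of the unit definition.
    nmol_per_30_min = nmol_in_aliquot * 30.0 / minutes
    units_in_aliquot = nmol_per_30_min / 10.0
    # Volume of the original sample represented in one withdrawn aliquot.
    sample_ul_in_aliquot = enzyme_ul * aliquot_ul / reaction_ul
    return units_in_aliquot / sample_ul_in_aliquot


# Illustrative counts only (not data from the source), 15 minute time point.
print(polymerase_units_per_ul(5200.0, 200.0, 80000.0, 15.0))
```

With these illustrative counts the sample would contain 0.1 unit per microliter, that is 0.5 unit in the 5 microliters assayed, which lies at the top of the stated working range.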

Activated calf thymus DNA was prepared by dissolving 1 g calf thymus DNA 
(#D-151, Sigma, St. Louis, MO) in 400 ml TM buffer [10 mM Tris-HCl (pH 7.3), 5 
mM MgCl2]. Four hundred microliters of a solution containing 40 units of 
RQ1-DNAse (Promega) in TM buffer was added to the DNA solution and incubated at 
37°C for 10 minutes. The DNAse digestion was stopped by heating the DNA solution 
at 68°C for 30 minutes. The activated calf thymus DNA was stored at -20°C until 
used. The activated calf thymus DNA was heated to 74°C for 10 minutes and then 
cooled to room temperature before use. 

As shown in Table 2 below, the Tne DNA polymerases generally eluted from 
the Heparin Sepharose column at a salt concentration of between 0.14 M and 0.29 M 
NaCl. Fractions containing the polymerase activity were pooled, placed into dialysis 
membrane tubing (as described above) and dialyzed against TEDGT buffer until the 
salt concentration was less than or equal to 0.05 M NaCl as measured by conductivity 
using a conductance meter (Yellow Spring Instrument Co., Yellow Springs, OH). 

The dialyzed polymerase fraction was then loaded onto a 7 ml Cibacron Blue 
3GA (Sigma, St. Louis, MO) column (1.25 cm i.d. x 6 cm height). The Cibacron 
Blue 3GA column was washed with 70 ml of 0.05 M NaCl in TEDGT buffer. A 100 
ml salt gradient was run over the column to elute the DNA polymerase; the gradient 
started at 0.05 M NaCl and ended at 1 M NaCl (all in TEDGT). Fractions (1.5 ml) 
were collected and assayed for DNA polymerase activity at 74°C (as described above). 
As shown in Table 2, the Tne DNA polymerases generally eluted from the 
Cibacron Blue 3GA column at a salt concentration of between 0.20 M and 0.46 M 
NaCl. Fractions containing the polymerase activity were pooled, placed in dialysis 
membrane (as described above) and dialyzed against storage buffer [20 mM Tris-HCl 
(pH 8.0), 100 mM KCl, 0.1 mM EDTA, 1 mM DTT, 50% glycerol and 0.5% Tween 
20]. Aliquots of preparations of purified Tne polymerases were electrophoresed on 
SDS-PAGE gels and stained with Coomassie blue. The purified DNA polymerases 
produced by all of the recombinant Tne polymerase constructs were judged to be at 
least 95% pure based on visual inspection of the Coomassie-stained SDS-PAGE gels. 



TABLE 2 

Salt Elution Characteristics For rTne DNA Polymerases 

Polymerase                         SEQ ID NO   Heparin Sepharose       Cibacron Blue 
rTne DNA Polymerase full-length    2           0.26 to 0.37 M NaCl     0.5 to 0.8 M NaCl 
Tne M284                           8           0.13 to 0.3 M NaCl      0.05 to 0.3 M NaCl 
Tne M284 (E325D)                   19          0.15 to 0.28 M NaCl     0.3 to 0.48 M NaCl 
Tne M284 (D468N)                   26          0.11 to 0.23 M NaCl     0.16 to 0.35 M NaCl 
Tne M284 (D323E)                   16          0.05 to 0.32 M NaCl     0.11 to 0.4 M NaCl 
Tne M284 (Y464F)                   23          0.11 to 0.28 M NaCl     0.24 to 0.4 M NaCl 
Tne M284 (D323A, D389A)            35          0.14 to 0.27 M NaCl     0.06 to 0.46 M NaCl 

The above results provide methods for the isolation of the full-length and 
modified forms of Tne DNA polymerase in a highly pure form at high yields. 

EXAMPLE 6 

Purification Of Tne DNA Polymerase From T. neapolitana Cells 



The preceding example described the isolation of recombinant Tne DNA 
polymerases from E. coli harboring plasmids which overexpress these enzymes. The 
full-length Tne DNA polymerase may also be isolated from T. neapolitana cells. 

T. neapolitana cells are obtained from the ATCC (ATCC 49049) and grown in 
anaerobic culture in MMS medium containing (per liter): 6.93 g NaCl; 1.75 g 
MgSO4·7H2O; 1.38 g MgCl2·6H2O; 0.16 g KCl; 25 mg NaBr; 7.5 mg H3BO3; 3.8 mg 
SrCl2·6H2O; 0.025 mg KI; 0.38 g CaCl2; 0.5 g KH2PO4; 0.5 g Na2S·9H2O; 2 mg 
(NH4)2Ni(SO4)2; 15 ml trace minerals [per liter: 3.0 g MgSO4·7H2O; 1.5 g 
nitrilotriacetic acid; 1.0 g NaCl; 0.5 g MnSO4·H2O; 0.1 g FeSO4·7H2O; 0.1 g 
CoCl2·6H2O; 0.1 g CaCl2; 0.1 g ZnSO4·7H2O; 0.01 g CuSO4·5H2O; 0.01 g 
AlK(SO4)2·12H2O; 0.01 g H3BO3 and 0.01 g Na2MoO4·2H2O]; 1 mg resazurin and 5 g 
starch at a pH of 6.5 (adjusted with H2SO4). For growth on solid medium, 0.8% agar 
(Difco) was added to the above medium. 

The cells are grown in a fermentation vessel maintained at 75-80°C under 
anaerobic conditions (i.e., under nitrogen). The cells are grown for approximately 28 
hours (early stationary phase). The cells are collected by centrifugation at 10,000 x g. 
The cell pellet may be frozen at -70°C until used. All of the subsequent operations 
should be carried out at 0 to 4°C unless otherwise stated. 

Resuspend about 50 g of frozen Thermotoga neapolitana cells in 100 ml of 
TEDGT [50 mM Tris-HCl (pH 7.3 at 25°C), 1 mM EDTA, 1 mM DTT, 10% 
glycerol, 0.1% Tween 20] containing 2.5 mM PMSF (from 144 mM stock in DMF). 
The thawed and resuspended cells can be lysed in an Aminco French Pressure Cell 
(American Instrument Co., Silver Spring, MD, cat. no. FA-073) at 16,000 to 24,000 
psi. This operation should be done twice to ensure adequate lysis. The lysate should be 
diluted by adding another 100 ml TEDGT containing 2.5 mM PMSF and stirring 
gently. 

PEI (polyethyleneimine) is added to the lysate to precipitate the DNA. The 
exact amount of PEI is determined empirically, but in general 0.2% PEI should be 
adequate to precipitate most of the DNA (greater than 90%). Approximately 10 ml of 
5% PEI is added to precipitate most of the DNA in the lysate. The lysate is 
centrifuged at 15,000 rpm for 15 minutes in a Beckman JA18 rotor to remove the 
precipitated DNA. The supernatant is retrieved and solid ammonium sulfate is added 
to 60% saturation to precipitate the DNA polymerase and other proteins. After the salt 
is dissolved, the sample is centrifuged at 15,000 rpm for 1 hour in a Beckman JA18 
rotor. The supernatant is discarded and the precipitated protein is gathered and 
dissolved in TEDGT buffer. The resolubilized protein is then placed in a dialysis 
membrane having a 12,000 to 14,000 mw cutoff (Spectra) and then dialyzed against 
TEDGT buffer to remove the ammonium sulfate. 
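
The 0.2% guideline and the 10 ml addition of 5% PEI can be reconciled with a simple dilution calculation. The sketch below is illustrative only: it takes the lysate volume to be about 200 ml (the two 100 ml portions of buffer, ignoring the volume contributed by the cells, which is an assumption), and the function names are hypothetical.

```python
def final_percent(stock_percent: float, stock_ml: float, lysate_ml: float) -> float:
    """Final PEI concentration (%) after adding stock_ml of stock to lysate_ml of lysate."""
    return stock_percent * stock_ml / (lysate_ml + stock_ml)


def stock_volume_for_target(target_percent: float,
                            stock_percent: float,
                            lysate_ml: float) -> float:
    """Volume of stock (ml) needed to reach target_percent in lysate_ml of lysate."""
    if not 0.0 < target_percent < stock_percent:
        raise ValueError("target must be positive and below the stock concentration")
    return target_percent * lysate_ml / (stock_percent - target_percent)


# 10 ml of 5% PEI into an assumed 200 ml of lysate.
print(f"{final_percent(5.0, 10.0, 200.0):.2f}% PEI")
# Volume of 5% PEI needed to reach 0.2% in the same lysate.
print(f"{stock_volume_for_target(0.2, 5.0, 200.0):.1f} ml")
```

Under the stated volume assumption, 10 ml of 5% PEI gives about 0.24%, slightly above the 0.2% guideline, which is in keeping with the instruction that the exact amount be determined empirically.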

The dialyzed protein solution is then loaded onto a 60 ml DEAE Sepharose 
(Sigma, St. Louis, MO) column (2.5 cm i.d. x 13 cm height). The column is washed 
with 300 ml 0 M NaCl in TEDGT buffer. A 300 ml salt gradient is run over the 
column to elute the DNA polymerase starting at 0 M NaCl and ending at 0.5 M NaCl 
(all in TEDGT buffer). Fractions (5.0 ml) are collected and assayed for DNA 
polymerase activity at 74°C using the protocol described in Example 5. Fractions 
containing the polymerase activity are pooled, placed in dialysis membrane (12,000 to 
14,000 mw cutoff; Spectra) and dialyzed against TEDGT buffer until the salt 
concentration is less than or equal to 0.05 M NaCl as measured by conductivity. 

The pooled polymerase fraction is then loaded onto a 15 ml Heparin Sepharose 
(Scientific Protein Laboratory, Waunakee, WI) column (1.7 cm i.d. x 6.5 cm height). 
The column is washed with 150 ml 0.05 M NaCl in TEDGT buffer. A 100 ml salt 
gradient is run over the column to elute the DNA polymerase starting at 0.05 M NaCl 
and ending at 1 M NaCl (all in TEDGT buffer). Fractions (1.5 ml) are collected and 
assayed for DNA polymerase activity at 74°C using the assay described in Example 5. 
The Tne DNA polymerase should elute between 0.14 M and 0.29 M NaCl (in TEDGT 
buffer). Fractions containing the polymerase activity are pooled, placed in dialysis 
membrane (12,000 to 14,000 mw cutoff; Spectra) and dialyzed against TEDGT buffer 
until the salt concentration is less than or equal to 0.05 M NaCl as measured by 
conductivity. 

The pooled polymerase fraction is then loaded onto a 15 ml DNA Agarose 
(Pharmacia) column (1.7 cm i.d. x 6.5 cm height). The column is washed with 75 ml 
0 M NaCl in TEDGT buffer. A 100 ml salt gradient is run over the column to elute 
the DNA polymerase starting at 0 M NaCl and ending at 0.5 M NaCl (all in TEDGT 
buffer). Fractions (1.5 ml) are collected and assayed for DNA polymerase activity at 
74°C using the assay described in Example 5. Fractions containing the polymerase 
activity are pooled, placed in dialysis membrane (12,000 to 14,000 mw cutoff; 
Spectra) and dialyzed against TEDGT buffer until the salt concentration is less than or 
equal to 0.05 M NaCl as measured by conductivity. 

The pooled polymerase fraction is then loaded onto a 7 ml Cibacron Blue 3GA 
(Sigma, St. Louis, MO) column (1.25 cm i.d. x 6 cm height). The column is washed 
with 70 ml of 0.05 M NaCl in TEDGT buffer. A 100 ml salt gradient is run over the 
column to elute the DNA polymerase starting at 0.05 M NaCl and ending at 1 M NaCl 
(all in TEDGT buffer). Fractions (1.5 ml) are collected and assayed for DNA 
polymerase activity at 74°C using the assay described in Example 5. The Tne DNA 
polymerase should elute between 0.20 M and 0.46 M NaCl (in TEDGT buffer). 
Fractions containing the polymerase activity are pooled, placed in dialysis membrane 
(12,000 to 14,000 mw cutoff; Spectra) and dialyzed against storage buffer [20 mM 
Tris-HCl (pH 8.0), 100 mM KCl, 0.1 mM EDTA, 1 mM DTT, 50% glycerol, 0.5% 
Tween 20]. 

The purity and approximate molecular weight of the DNA polymerase can be 
assessed by SDS-PAGE using a 4-20% gradient Tris-Glycine SDS 
gel (Novex, San Diego, CA). An aliquot of the purified material is mixed with sample 
buffer [63 mM Tris-HCl pH 6.8, 10% glycerol, 2% SDS, 0.0025% bromophenol blue] 
and the gel is run using the following running buffer [25 mM Tris-Base, 192 mM 
glycine, 0.1% SDS, pH 8.3]. The gel can be run for 90 minutes at 125 V D.C. until 
the bromophenol blue band is just to the bottom of the gel. The apparent molecular 
weight for Tne DNA polymerase should be about 97,000. Using the activity assay 
described in Example 5, the number of units of DNA polymerase per microliter can be 
established. By visually assessing the quantity of DNA polymerase on the Coomassie 
stained SDS-PAGE gel compared to the protein standards run in the molecular weight 
marker lane, the specific activity of the DNA polymerase preparation may be 
estimated. The specific activity of the purified Tne DNA polymerase should be 
approximately 100,000 units/mg. 
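
The specific activity estimate is simply the activity concentration divided by the protein concentration. A minimal sketch follows; the function name is hypothetical and the input values are illustrative, not measurements from the source. The protein concentration is assumed to come from the visual comparison with the marker lane and is therefore only approximate.

```python
def specific_activity_units_per_mg(units_per_ul: float,
                                   protein_mg_per_ml: float) -> float:
    """Specific activity (units/mg) from activity (units/ul) and protein (mg/ml)."""
    if protein_mg_per_ml <= 0.0:
        raise ValueError("protein concentration must be positive")
    units_per_ml = units_per_ul * 1000.0
    return units_per_ml / protein_mg_per_ml


# Illustrative inputs: 5 units/ul and an estimated 0.05 mg/ml of polymerase.
print(f"{specific_activity_units_per_mg(5.0, 0.05):,.0f} units/mg")
```

Because the protein estimate is visual, the result is best read as an order-of-magnitude check against the expected value of approximately 100,000 units/mg.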

EXAMPLE 7 

Characterization Of The Full-Length And Modified Tne Polymerases 

The full-length and modified Tne polymerases were assayed for 5' to 3' 
exonuclease activity and 3' to 5' exonuclease activity. 



a) 5' To 3' Exonuclease Assay 

A 5' to 3' exonuclease assay was performed on the Tne M284 DNA 
polymerase (SEQ ID NO:8) present in crude lysates of cells containing the pJM284 
construct (prepared as described in Example 3c) to determine whether any residual 5' 
to 3' exonuclease activity remained in this molecule. A comparison was made of the 
amount of 5' to 3' exonuclease activity present in the following thermostable DNA 
polymerases: Tne M284 (SEQ ID NO:8), the full-length rTne polymerase (SEQ ID 
NO:2), nTaq (Promega) and UlTma (Perkin Elmer). The 5' to 3' exonuclease assay 
was performed as follows. 

End-labeled substrate DNAs were prepared by digestion of pBR322 DNA with 
either EcoRI or EcoRV followed by incubation with [γ-32P]ATP and T4 polynucleotide 
kinase. The 5' to 3' exonuclease assay was performed in a final reaction volume of 
25 μl and contained 1X Taq buffer [10 mM Tris-HCl (pH 9.0 at 25°C), 50 mM KCl, 
0.1% Triton X-100], 1.5 mM MgCl2, 200 μM of each of the four dNTPs, 50 ng of 
labeled substrate DNA and 5 units of the DNA polymerase to be tested (a no enzyme 
control was also conducted; water was used in place of the enzyme). The reaction was 
incubated for 1 hour at 74°C. The reaction was terminated by the addition of 5 μl of 
0.5 M EDTA. Ten microliters of this mixture were spotted onto 2.3 cm circular DE81 
filters (Whatman). The filters were dried briefly under a heat lamp. The filters were 
washed in 50 ml of 0.5 M sodium phosphate (pH 6.8) twice for 5 minutes/wash to 
remove unincorporated counts. The no enzyme control was used to permit 
determination of the total cpm in the sample. The washed filters were dried under a 
heat lamp and then the incorporated cpm and total cpm (no enzyme control filter) were 
determined by liquid scintillation counting. 
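
The filter counts can be converted to the percentage of end label removed by the enzyme. The sketch below is an interpretation rather than a method stated in the text: it assumes that counts retained on the washed DE81 filter represent label still attached to the substrate, and that the no enzyme control filter gives the total. The function name is hypothetical.

```python
def percent_label_released(retained_cpm: float, total_cpm: float) -> float:
    """Percentage of end label released, from retained and no-enzyme control counts."""
    if total_cpm <= 0.0:
        raise ValueError("total cpm must be positive")
    released = 1.0 - retained_cpm / total_cpm
    # Counting noise can push a sample slightly above the control; report that as zero.
    return max(0.0, released) * 100.0


# Illustrative counts only: 2,500 cpm retained against a 10,000 cpm control.
print(percent_label_released(2500.0, 10000.0))
```

An enzyme lacking the activity would be expected to give a value close to zero, within the scatter of the duplicate filters.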

The results of this assay showed that nTaq and the full-length Tne (SEQ ID 
NO:2) polymerases contained considerable 5' to 3' exonuclease activity, while the 
UlTma and Tne M284 (SEQ ID NO:8) polymerases did not contain detectable levels 
of 5' to 3' exonuclease activity. 



b) 3' To 5' Exonuclease Assay 

3' to 5' exonuclease assays were performed in order to establish how 
mutations in the putative exonuclease domain of the Tne polymerase gene affected this 
activity. The assay was initially performed on crude lysates from E. coli cells 
containing a plasmid which produces the Tne M284 polymerase (SEQ ID NO:8). The 
assays were subsequently performed on purified preparations of Tne M284 polymerase 
(SEQ ID NO:8). 

The assay was performed using either a single stranded or double stranded 
DNA substrate which contained a 3' end label. The substrate DNAs were prepared as 
follows. 

To create the double stranded substrate, Lambda DNA/EcoRI Markers 
(Promega G1721) were 3' end labeled with [α-32P]dATP in a final reaction volume of 
50 μl containing 10 μg of lambda DNA, 1X Buffer A [6 mM Tris-HCl (pH 7.5 at 
37°C), 6 mM MgCl2, 6 mM NaCl and 1 mM DTT], 5 μl [α-32P]dATP (3000 
Ci/mmol; Amersham), 5 units Klenow fragment (Promega) and 10 mM of each of the 
four dNTPs. The reaction was incubated for 20 minutes at 37°C. The Klenow 
enzyme was inactivated by heating the mixture at 65°C for 15 min. Unincorporated 
counts were removed by chromatography of the reaction mixture on a Nick™ Column 
(Pharmacia) according to the manufacturer's instructions. The labeled DNA was 
eluted in a volume of 400 μl. A 10 μl aliquot of the eluted DNA was counted by 
liquid scintillation counting and the aliquot contained approximately 2 x 10^5 cpm. 

To create the single stranded substrate, a synthetic 74 nucleotide 
oligonucleotide, PM3074 (SEQ ID NO:41), was 3' end labeled with [α-32P]dATP in a 
final reaction volume of 10 μl containing 10 pmoles of the PM3074 oligonucleotide, 
1X TdT Buffer [50 mM Tris-HCl (pH 7.5), 10 mM MgCl2, 5 mM DTT and 0.1 mM 
spermidine], 3 μl [α-32P]dATP (3000 Ci/mmol; Amersham) and 15 units terminal 
deoxynucleotidyl transferase (TdT) (Promega). The reaction was incubated for 60 
minutes at 37°C. The TdT enzyme was inactivated by heating the mixture at 65°C for 
15 min. Unincorporated counts were removed by chromatography of the reaction 
mixture on a Nick™ Column (Pharmacia) according to the manufacturer's instructions. 
The labeled DNA was eluted in a volume of 400 μl. A 10 μl aliquot of the eluted 
DNA was counted by liquid scintillation counting and the aliquot contained 
approximately 1 x 10^5 cpm. 

The 3' to 5' exonuclease assay was performed as follows. In a final volume of 
50 μl, the following components were assembled: 1 or 2 units of the DNA polymerase 
to be assayed, 5 μl of 10X Toga buffer [10 mM Tris-HCl (pH 8.8 at 25°C), 10 mM 
KCl, 0.002% Tween 20] and nuclease-free water (Promega). One tube was also set up 
which did not contain any polymerase (i.e., a no enzyme control). The reaction 
mixtures were prewarmed to the reaction temperature (reactions were performed at 
either 25°C or 74°C) and 10 μl of either the labeled single stranded or double stranded 
substrates were added to start the assay. Ten microliter fractions were removed at the 
following time points: 2, 4 or 6 minutes. The aliquots were spotted onto 2.3 cm 
circular DE81 filters and processed as described in section a) above. 

One unit of 3' to 5' exonuclease activity is defined as the 
amount of enzyme required to remove 1 pmol of labeled 3' end from the substrate in 
30 minutes. 
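
One way to turn the sampled time course into units under this definition is to fit the rate of label release and extrapolate it to 30 minutes. The sketch below is an illustration only. It assumes that release is linear over the sampled time points, and that the number of pmol of labeled 3' ends in the reaction (labeled_ends_pmol, a hypothetical parameter) has been determined separately, which the text does not describe. All names and input values are illustrative.

```python
from typing import Sequence


def release_rate_per_min(times_min: Sequence[float],
                         fraction_released: Sequence[float]) -> float:
    """Ordinary least-squares slope of fraction released versus time (per minute)."""
    n = len(times_min)
    if n < 2 or n != len(fraction_released):
        raise ValueError("need at least two matched time points")
    mean_t = sum(times_min) / n
    mean_f = sum(fraction_released) / n
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    if sxx == 0.0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - mean_t) * (f - mean_f)
              for t, f in zip(times_min, fraction_released))
    return sxy / sxx


def exonuclease_units(times_min: Sequence[float],
                      fraction_released: Sequence[float],
                      labeled_ends_pmol: float) -> float:
    """Units in the reaction: pmol of labeled 3' end removed per 30 minutes."""
    rate = release_rate_per_min(times_min, fraction_released)
    return rate * labeled_ends_pmol * 30.0


# Illustrative time course: 2%, 4% and 6% released at 2, 4 and 6 minutes,
# with an assumed 2 pmol of labeled ends in the reaction.
print(exonuclease_units([2, 4, 6], [0.02, 0.04, 0.06], 2.0))
```

Dividing the result by the polymerase units present in the same reaction gives the exonuclease to polymerase ratio used for the comparisons between enzymes.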

In addition to assaying the polymerases for 3' exonuclease activity, all samples 
were also analyzed for DNA polymerase activity using the assay described in Example 
5(b). In this manner, the ratio of 3' exonuclease activity to polymerase activity could 
be determined for each polymerase. 

When the 3' to 5' exonuclease assay was performed on crude lysates 
containing the Tne M284 enzyme, the assay was run using only the single stranded 
substrate at 25°C. Under these conditions, a comparison was made between purified 
UlTma (a modified form of Tma polymerase which lacks 5' to 3' exonuclease activity) 
and UlTma spiked into a lysate derived from E. coli cells shown to lack 3' to 5' 
exonuclease activity; the spiked sample showed a 22% lower 3' exonuclease:polymerase 
activity ratio (polymerase activity was measured using the assay described 
in Example ). As E. coli lysate alone shows no 3' exonuclease activity, this result 
indicated that results obtained with crude lysates containing the Tne M284 protein 
(SEQ ID NO:8) should underestimate the 3' to 5' exonuclease levels that would be 
seen using purified Tne polymerase preparations. 

When purified UlTma™ was compared with crude lysates containing the 
full-length (SEQ ID NO:2) or Tne M284 (SEQ ID NO:8) polymerases, the highest 
exonuclease:polymerase activity ratio was seen for the full-length Tne enzyme (1.37), 
then M284 Tne (0.83), then UlTma™ (0.45). These results were unexpected as it was 
not predictable that a deletion in the 5' to 3' exonuclease domain (Tne M284 
polymerase) would result in a 39% decrease in 3' to 5' exonuclease activity. 
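
The 39% figure follows directly from the two ratios quoted above, as the short check below shows. The function name is illustrative.

```python
def percent_decrease(reference: float, value: float) -> float:
    """Decrease of value relative to reference, expressed as a percentage."""
    if reference <= 0.0:
        raise ValueError("reference must be positive")
    return (reference - value) / reference * 100.0


# Exonuclease:polymerase ratios quoted in the text.
FULL_LENGTH_RATIO = 1.37
M284_RATIO = 0.83

print(f"{percent_decrease(FULL_LENGTH_RATIO, M284_RATIO):.0f}% decrease")
```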

Purified Tne M284 polymerase was used in 3' exonuclease assays in 
comparison to purified UlTma™ using both single stranded and double stranded 
substrates at 25°C and 74°C. Surprisingly, both enzymes displayed the same level of 
3' exonuclease activity on single stranded substrate at 25°C. Results obtained using 
crude lysates containing Tne M284 polymerase suggested that Tne M284 would have a 
higher activity. Both enzymes (purified Tne M284 and UlTma™) gave nearly 
identical results when the 3' exonuclease assay was performed using the double 
stranded substrate at 25°C (and both showed very low activity under these conditions). 

When the 3' exonuclease assay was performed using a single stranded substrate 
at 74°C, UlTma™ had no activity, whereas purified Tne M284 showed an 
exonuclease:polymerase ratio of 0.32. 

These results demonstrate that purified Tne M284 polymerase and UlTma™ 
have significantly different 3' to 5' exonuclease activities. 



EXAMPLE 8 

Characterization Of The Purified Tne Polymerases 



In order to ascertain the characteristics of the full length and modified forms of 
the rTne polymerases, a number of determinations were made as described below. 

a) Molecular Weight On SDS-PAGE Gels 

The apparent molecular weights of the full length and modified rTne 
polymerases were determined by SDS-PAGE. An aliquot (10 units) of each purified 
polymerase was mixed with sample buffer [63 mM Tris-HCl pH 6.8, 10% glycerol, 
2% SDS, 0.0025% bromophenol blue] and applied to a precast 4-20% gradient 
Tris-Glycine SDS gel (Novex, San Diego, CA). Molecular weight markers (Promega) 
were run on the same gel to allow calculation of the molecular weight of the Tne 
polymerases. The gel was run using the following running buffer [25 mM Tris-Base, 
192 mM glycine, 0.1% SDS, pH 8.3] for 90 minutes at 125 V D.C. until the 
bromophenol blue band was just to the bottom of the gel. 

Following electrophoresis, the gel was stained with Coomassie blue to visualize 
the proteins. The full-length rTne polymerase migrated with an apparent molecular 
weight of 97,000 daltons. Tne M284, Tne M284(D323E), Tne M284(E325D), Tne 
M284(Y464F), Tne M284(D468N), and Tne M284(D323A, D389A) all migrated with 
an apparent molecular weight of 70,000 daltons. 

b) Specific Activity Of Purified Tne Polymerases 

Preparations of purified full-length rTne and Tne M284, Tne M284(D323E), 
Tne M284(E325D), Tne M284(Y464F), Tne M284(D468N), and Tne M284(D323A, 
D389A) were all found to have a specific activity of 100,000 units/mg when the 
enzymes were purified and DNA polymerase assays were conducted as described in 
Example 5. 

c) 3' To 5' Exonuclease Activity 

Purified preparations of full-length rTne and Tne M284, Tne M284(D323E), 
Tne M284(E325D), Tne M284(Y464F), Tne M284(D468N) and Tne M284(D323A, 
D389A) were assayed for 3' to 5' exonuclease activity. In the same experiment, the 
3' to 5' exonuclease activities of the UlTma™ (Perkin Elmer) and nTaq (Promega) 
polymerases were also measured for comparison to the Tne polymerases. 

The 3' exonuclease assay involved the incubation of the above enzymes with a 
3' end-labeled DNA substrate; the loss of radioactivity from the labeled substrate was 
measured after a 10 minute incubation period. 

i) Preparation Of The 3' End-Labeled Substrate 

Ten micrograms of lambda phage DNA was digested with 50 u of MluI 
(Promega) in a 50 μl reaction volume in 1X Buffer D [6 mM Tris-HCl (pH 7.9 at 
37°C), 6 mM MgCl2, 150 mM NaCl and 1 mM DTT]. The reaction was incubated for 
3 hours at 37°C. The 3' ends were then filled in using [α-32P]-dCTP and unlabelled 
dGTP in a reaction containing 5 u of Klenow Exo- (USB) and 1X Buffer D in a 
volume of 100 μl; the reaction mixture was incubated for 20 min at 25°C. The 
reaction was terminated by heating the sample to 74°C for 15 min. The bulk of the 
unincorporated [α-32P]-dCTP was removed by passage of the sample over a Nick™ 
column (Pharmacia) and the labeled DNA was eluted in 400 μl TE [10 mM Tris-HCl 
(pH 7.4), 1 mM EDTA]. An aliquot (10 μl) of the eluted DNA was counted in a 
scintillation counter and the aliquot contained approximately 2 x 10^5 cpm. 

ii) 3' To 5' Exonuclease Assay
The 3' to 5' exonuclease assay was performed as follows. In a final volume of
50 µl, the following components were assembled: 1 or 2 units of the DNA polymerase
to be assayed, 5 µl of 10X Toga buffer [10 mM Tris-HCl (pH 8.8 at 25°C), 10 mM
KCl, 0.002% Tween 20] and nuclease-free water (Promega). One tube was also set up
which did not contain any polymerase (i.e., a no-enzyme control). The reaction
mixtures were prewarmed to the reaction temperature (reactions were performed at
either 25°C or 74°C) and 10 µl of the labeled double-stranded substrate was added to
start the assay. Ten microliter fractions were removed at the following time points: 2,
4, 6 and 10 minutes. (In a subsequent experiment, aliquots were withdrawn at 30 and
60 minutes to permit the detection of activity for enzymes showing very low levels of
activity.) The aliquots were spotted onto 2.3 cm circular DE81 filters and processed
as described in Example 7, section a) above.

One unit of 3' to 5' exonuclease activity is defined as the
amount of enzyme required to remove 1 pmol of labeled 3' end from the substrate in
30 minutes.
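
The unit definition above amounts to a rate calculation. The following Python sketch is illustrative only: the function name is hypothetical, and it assumes that release of label is linear over the sampled interval, which the original text does not state.

```python
def exonuclease_units(pmol_removed: float, minutes: float) -> float:
    """Convert label released in a timed incubation into 3'->5' exonuclease units.

    One unit removes 1 pmol of labeled 3' end in 30 minutes, so a measurement
    taken over a different interval is scaled to a 30 minute basis.
    Assumes linear release over the interval (an assumption of this sketch).
    """
    if minutes <= 0:
        raise ValueError("incubation time must be positive")
    return pmol_removed * 30.0 / minutes


# Hypothetical reading: 0.5 pmol released in a 10 minute incubation.
print(exonuclease_units(0.5, 10.0))
```

Scaling every time point to the same 30 minute basis lets the 2, 4, 6 and 10 minute samples be compared directly.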

In addition to assaying the polymerases for 3' exonuclease activity, all samples
were also analyzed for DNA polymerase activity using the assay described in Example
5(b). In this manner, the ratio of 3' exonuclease activity to polymerase activity could
be determined for each polymerase. The results are reported as the ratio of 3'
exonuclease activity to polymerase activity. The value obtained for the full-length Tne
polymerase was assigned a value of 100% and all other values are expressed relative to
this value.

TABLE 3

3' Exonuclease Activity

DNA Polymerase              Exonuclease/Polymerase Ratio (% of rTne)
rTne                        100 (error term not legible in source)
UlTma™                      (value not legible in source)
Tne M284                    (value not legible in source)
Tne M284(Y464F)             1.9 ± 0.2
Tne M284(D468N)             0.47 ± 0.02
Tne M284(D323E)             0.0
Tne M284(E325D)             0.0
Tne M284(D323A, D389A)      0.0
nTaq                        0.0
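
The normalization used for Table 3 (exonuclease-to-polymerase ratio, expressed relative to the full-length enzyme) can be sketched as follows. The function and parameter names are hypothetical and the example numbers are invented for illustration; they are not measurements from this example.

```python
def relative_exo_ratio(exo_units: float, pol_units: float,
                       ref_exo_units: float, ref_pol_units: float) -> float:
    """Return the exo/pol ratio of a test enzyme as a percentage of a reference.

    The reference enzyme (full-length Tne polymerase in the text) is thereby
    assigned 100%.
    """
    test_ratio = exo_units / pol_units
    reference_ratio = ref_exo_units / ref_pol_units
    return 100.0 * test_ratio / reference_ratio


# Invented numbers: the test enzyme has half the exo/pol ratio of the reference.
print(relative_exo_ratio(2.0, 4.0, 2.0, 2.0))  # 50.0
```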



The results shown in Table 3 demonstrate that deletions which remove 5'
exonuclease activity (e.g., Tne M284) also affect 3' exonuclease activity in the Tne
polymerases (compare rTne with Tne M284; activity of Tne M284 is roughly one third
that seen in rTne). Mutation of amino acid residues suspected of being critical for 3'
exonuclease activity (by analogy to similar residues in E. coli polymerase I) further
reduces or eliminates 3' exonuclease activity. The virtual elimination of 3' exonuclease
activity seen with the Tne M284(D468N) polymerase was unexpected. The analogous
mutation in the Klenow fragment of DNA polymerase I (D501N) reduces 3'
exonuclease activity only by 2-fold [Derbyshire et al., EMBO J. 10:17 (1991)]. These
results underscore the fact that much remains to be learned about structure-function
relationships and that one cannot predict, with certainty, the effect of a given mutation
based on analogy to other proteins.

d) Thermostability Of The Purified Tne Polymerases
At 97.5°C

Thermostability was measured by incubating a DNA polymerase at 97.5°C for 
various amounts of time and measuring the remaining DNA polymerase activity at 
74°C. Samples were withdrawn after 0, 5, 10, 30, 60, 90, and 120 minutes of
exposure to 97.5°C and DNA polymerase assays were conducted as described in
Example 5b. The time necessary to reduce the activity by one-half the initial value 
was determined by plotting the remaining activity versus the time of incubation at 
97.5°C. 
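
As a rough illustration of how a half-life can be read from such a time course, the sketch below interpolates linearly between the two sampling times that bracket 50% of the initial activity. The interpolation method, the function name and the example data are assumptions made for illustration; the text states only that remaining activity was plotted against incubation time.

```python
def half_life_minutes(times, activities):
    """Estimate the time at which activity falls to half of its initial value.

    times      -- sampling times in minutes, ascending, starting at 0
    activities -- remaining polymerase activity at each time (any unit)
    Returns None if activity never drops to 50% within the sampled range.
    """
    half = activities[0] / 2.0
    points = list(zip(times, activities))
    for (t0, a0), (t1, a1) in zip(points, points[1:]):
        # Look for the interval in which the curve crosses the 50% level.
        if a0 >= half >= a1 and a0 > a1:
            return t0 + (a0 - half) * (t1 - t0) / (a0 - a1)
    return None


# Invented time course (not data from this example).
print(half_life_minutes([0, 5, 10, 30], [100.0, 60.0, 40.0, 10.0]))  # 7.5
```

A fit to an exponential decay would be an alternative; linear interpolation is used here only because it needs no curve-fitting library.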

The incubation buffer contained 10 mM Tris-HCl (pH 8.8 at 25°C), 10 mM
KCl, 0.002% Tween 20, 1.5 mM MgCl₂ and 2-3 units of the enzyme to be tested.
All enzymes, including commercially available enzymes (e.g., nTaq, UlTma™), were 
assayed using the DNA polymerase assay described in Example 5. The results of the 
thermostability assays are summarized in Table 4 below. 



TABLE 4

Thermostability of Tne Polymerases

Enzyme                      SEQ ID NO    Half-Life at 97.5°C (min)
full-length rTne            2            5
Tne M284(E325D)             19           5
native Taq                  -            8
UlTma™                      -            12
Tne M284(D323E)             16           12.5
Tne M284(Y464F)             23           16
Tne M284                    8            18
Tne M284(D323A, D389A)      35           22
Tne M284(D468N)             26           66

The results shown in Table 4 demonstrate that deletion of the amino terminal
5' to 3' exonuclease domain of Tne polymerase increases resistance to thermal
degradation (2.5 to 13.2 fold). While amino-terminal deletions have been shown to
increase thermostability (i.e., thermal tolerance) of certain modified forms of Taq DNA
polymerase (e.g., the Stoffel fragment) and Tma DNA polymerase, the increase seen is
generally about 2-3 fold. The increase in thermostability seen by the introduction of
single or double point mutations into the Tne M284 deletion mutant was unexpected,
especially the dramatic increase in thermostability caused by the single point mutation
present in Tne M284(D468N) (SEQ ID NO:26).

The Tne M284(D468N) protein is identical to the Tne M284 protein with the
exception that an asparagine is substituted for an aspartate at position 468; the carbon
backbones of these two molecules should be identical, yet there is a greater than 5-fold
resistance to thermal degradation seen when asparagine is present at position 468. The
only difference between these two molecules should be a negatively charged carboxylate
group (aspartate) and a neutral amide group (asparagine); both of these groups can
participate in hydrogen-bonding. Substitutions of solvent-exposed amino acids (e.g.,
aspartate or asparagine) have been shown to have little effect on protein stability or
structure [Matthews, Ann. Rev. Biochem. 62:139 (1993)], leading to the view that the
rigid parts of proteins are critical for folding and stability. Based on analogy to the E.
coli DNA polymerase I molecule at position 501, it is expected that position 468 in the
Tne DNA polymerase is accessible to solvent. Furthermore, since Tne M284(D468N)
lacks the carboxylate group to coordinate a metal ion, it was predicted that a slight
destabilization (in response to heat) would be seen rather than a dramatic stabilization.
These data show that a subtle change in the 3' exonuclease active site can dramatically
alter the thermostability (i.e., thermotolerance) of the Tne DNA polymerase.

The above results demonstrate that several of the modified Tne polymerases
have superior thermostability; accordingly, when these enzymes are used in PCR and
other reactions run at elevated temperature, less polymerase activity needs to be used
as less enzyme is inactivated by exposure to elevated temperature.

g) Optimal Temperature For DNA Polymerase Activity

The DNA polymerase activity of several Tne polymerases and nTaq DNA
polymerase was measured at various temperatures using the assay described in
Example 5b (with the exception that the temperature of incubation was varied). The
results are summarized in Table 5. The temperature which gave the highest activity
for a given enzyme was assigned a value of 100% and all other values given are
expressed relative to the 100% value.

The results shown in Table 5 demonstrate that for nTaq DNA polymerase, the
maximal DNA polymerase activity was present when the reaction was run at 78°C.
Optimal temperature for DNA polymerase activity for the full-length rTne DNA
polymerase was 74°C. Optimal temperature for DNA polymerase activity for the Tne
M284 and Tne M284(D323A, D389A) DNA polymerases was 75°C.
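
The normalization described above (highest activity set to 100%, all other temperatures expressed relative to it) can be sketched in a few lines of Python. The function names and the raw activity values are hypothetical; only the normalization rule comes from the text.

```python
def normalize_to_max(activity_by_temp):
    """Express each activity as a percentage of the highest activity observed."""
    peak = max(activity_by_temp.values())
    return {temp: 100.0 * act / peak for temp, act in activity_by_temp.items()}


def optimal_temperature(activity_by_temp):
    """Return the temperature (key) at which the measured activity is highest."""
    return max(activity_by_temp, key=activity_by_temp.get)


# Invented raw activities (arbitrary units) keyed by assay temperature in °C.
raw = {70: 1.54, 72: 1.72, 74: 2.0}
print(optimal_temperature(raw))  # 74
print(normalize_to_max(raw))
```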



TABLE 5

Optimal Temperature For Tne Polymerase Activity

Temp. (°C)    Tne M284    Tne M284(D323A, D389A)    nTaq    rTne
40            12          15                        n.d.    n.d.
45            15          17                        n.d.    n.d.
50            18          20                        n.d.    n.d.
55            24          26                        n.d.    n.d.
60            33          36                        n.d.    n.d.
65            53          52                        n.d.    n.d.
70            71          80                        n.d.    77
72            n.d.        n.d.                      n.d.    86
74            n.d.        n.d.                      n.d.    100
75            100         100                       n.d.    n.d.
76            n.d.        n.d.                      93      94
77            n.d.        n.d.                      97      n.d.
78            n.d.        n.d.                      100     96
79            n.d.        n.d.                      97      n.d.
80            53          56                        85      85
81            n.d.        n.d.                      82      n.d.
82            n.d.        n.d.                      68      68
85            16          16                        n.d.    n.d.
90            8           8                         n.d.    n.d.



EXAMPLE 9 

The Tne DNA Polymerases Provide Improved 
Enzymes For Use In The Polymerase Chain Reaction 



The use of the full-length and modified forms of Tne polymerase in the PCR 
was examined. The results of the following experiments show that the Tne 
polymerases provide improved enzymes for a variety of PCR applications. 

a) The Modified Tne Polymerases Utilize A Broader 

Range Of Optimal dNTP Concentrations In The PCR 

PCRs were conducted using a range of dNTP concentrations to
determine the optimal range of dNTP concentrations utilized by either the UlTma™
DNA polymerase (Perkin Elmer) or the Tne M284 and Tne M284(D323A, D389A)
polymerases. The modified Tne polymerases were found to produce high yields of
amplified product over a much broader range of nucleotide concentrations. The PCRs
were conducted as follows.

The three enzyme preparations were assayed for DNA polymerase activity
using the assay in Example 5b so that the same amount of enzyme was used in the
PCRs. PCRs were performed in a buffer optimized for each type of enzyme; the
UlTma™ and Tne enzymes were both found to give optimal results in the following
buffer [10 mM Tris-HCl (pH 8.8 at 25°C), 10 mM KCl, 0.002% Tween 20]. All
reactions contained 1 ng of pGEM-luc (Promega) as the template, 20 pmol of each
primer [the primers used were LME41 (SEQ ID NO:42) and LME43 (SEQ ID NO:43)],
1.5 mM MgCl₂ (this concentration was chosen as it was optimal for both the UlTma™
and the modified Tne enzymes as shown below in section b) and 2.5 units of each
enzyme. The final reaction volume was 50 µl.

A dilution series was created for the mixture of all four dNTPs ranging from
20 to 200 µM (20, 40, 60, 80, 100, 120, 140, 160 and 200 µM). The PCRs were
thermal cycled using the following conditions: an initial denaturation at 96°C for 2
min, followed by 30 cycles comprising denaturation at 94°C for 30 sec and
annealing/extension at 65°C for 2 min; following the last cycle the tubes were
incubated at 65°C for 10 min and then the tubes were incubated at 4°C. Thermal
cycling was conducted on a Perkin Elmer Thermocycler Model 480. The PCR
products were resolved on 1% agarose gels, stained with EtBr and quantitated by
visual inspection of the stained gels. The experiment was repeated a second time and
similar results were obtained.
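
For readers who program thermal cyclers from scripts, the cycling conditions above can be written down as a small data structure. This is a sketch only: the `Step` class and `total_hold_seconds` function are hypothetical names, and the computed time covers programmed holds only (ramp times and the final 4°C hold are ignored).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: float   # block temperature in degrees Celsius
    seconds: int    # programmed hold time


def total_hold_seconds(initial, cycle, n_cycles, final):
    """Sum the programmed hold times of a three-part cycling program."""
    return (sum(s.seconds for s in initial)
            + n_cycles * sum(s.seconds for s in cycle)
            + sum(s.seconds for s in final))


# Conditions as stated in the text for the dNTP titration PCRs.
initial = [Step(96, 120)]               # initial denaturation, 2 min
cycle = [Step(94, 30), Step(65, 120)]   # denaturation; annealing/extension
final = [Step(65, 600)]                 # final incubation, 10 min

print(total_hold_seconds(initial, cycle, 30, final) / 60)  # 87.0 minutes
```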

The optimal dNTP concentration for UlTma™ was found to be 40 µM (the
same value as reported to be optimal by Perkin Elmer in the UlTma™ product insert);
slight levels of product could be detected using the UlTma™ enzyme at a
concentration of 20 µM dNTPs but not at all at other concentrations tested. The
optimal range of concentrations of dNTPs for Tne M284(D323A, D389A) was found
to be from 60 to 200 µM with slight amounts of product being detected at both 20
and 40 µM dNTP. The optimal range of concentrations of dNTPs for Tne M284 was
from 40 µM to 200 µM with slight amounts of product being detected at 20 µM
dNTP.

The ability to use a wide range of dNTP concentrations is advantageous. The
modified Tne polymerases allow PCR users wide flexibility in the design of the
reaction conditions. Because the modified Tne polymerases are not sensitive to small
variations in dNTP concentration, PCRs using these enzymes are more robust.
Additionally, the ability to use a higher concentration of dNTPs may also allow a
higher yield of product to be generated in a PCR using the modified Tne polymerases
as compared to the use of the UlTma™ polymerase.

b) The Tne Polymerases Tolerate A Broader Range Of
Mg++ Concentrations In PCR

In side-by-side tests with UlTma™ DNA polymerase and Tne M284 or Tne
M284(D323A, D389A), the Tne variants were found to produce high yields of
amplified product over a broader range of magnesium ion concentrations. A dilution
series was created for magnesium (MgCl₂) ranging from 0.5 to 5 mM (0.5, 1.0, 1.5,
2.0, 2.5, 3.0, 4.0, and 5.0 mM). The other components of the PCR were as described
above in section a) with the exception that a single dNTP concentration was used (40
µM dNTP was used for the UlTma™ enzyme and the modified Tne enzymes as this
value was within the operable range for all three enzymes; it is noted that 40 µM
dNTP is not optimal for the Tne M284(D323A, D389A) enzyme and that even higher
product yields would be expected if 60 µM dNTP were used for this enzyme) and the
concentration of MgCl₂ was varied. Thermal cycling conditions were as described
above in section a). The reaction products were treated as described above in
section a).

The results of these assays showed that the UlTma™ enzyme (Perkin Elmer)
had an optimal concentration of 1.5 mM MgCl₂ and a great deal of non-specific
background products were seen on the gels (as a smear). The optimal concentration
for the Tne M284 enzyme was found to be from 1 to 3 mM MgCl₂ and some non-specific
background products were observed. The optimal concentration for Tne M284(D323A,
D389A) was from 1 to 3 mM MgCl₂ and very little or no detectable non-specific
background products were observed.

The ability of a thermostable enzyme to produce only specific PCR products
over a wide range of Mg++ concentrations is important for certain PCR applications.
For example, multiplexing PCRs utilize several pairs of primers to amplify several
different targets in the sample. As each primer pair and target combination will have
its own optimum Mg++ concentration and this value may vary considerably from primer
set to primer set, the availability of thermostable polymerases having a wide optimum
for Mg++ is advantageous. Therefore, the Tne M284 and Tne M284(D323A, D389A)
polymerases provide improved thermostable DNA polymerases.

c) Modified Tne Polymerases Produce High PCR
Product Yields

A comparison was made between the yield of PCR product obtained using a
consistent amount of polymerase activity when the following enzymes were used in the
PCR: Tne M284, Tne M284(D323E), Tne M284(E325D), Tne M284(Y464F), Tne
M284(D468N), Tne M284(D323A, D389A), AmpliTaq (Perkin Elmer), nTaq
(Promega) and UlTma™ (Perkin Elmer). All enzymes were assayed using the DNA
polymerase assay described in Example 5b so that the same number of units of
enzyme was added to the PCRs.

PCRs which used UlTma™ polymerase contained 10 mM Tris-HCl (pH 8.8 at
25°C), 10 mM KCl, 0.002% Tween 20 and 1.5 mM MgCl₂. This buffer gives optimal
PCR results for the UlTma™ enzyme (as reported by Perkin Elmer). For the results
shown in Table 6, PCRs which used the Tne DNA polymerases contained 10 mM
Tris-HCl (pH 8.8 at 25°C), 10 mM KCl, 0.002% Tween 20 and 1.5 mM MgCl₂. This
buffer gives acceptable results for the Tne polymerases.

For the results shown in Table 7, PCRs which used the Tne DNA polymerases
contained 10 mM Tris-HCl (pH 9.0 at 25°C), 10 mM KCl, 0.01% Tween 20 and 1.5
mM MgCl₂ as it was found that the Tne DNA polymerases perform better when the
pH of the buffer is raised to 9.0 and the amount of detergent (i.e., Tween 20) is
increased to 0.01%.

PCRs which used the UlTma™ enzyme contained 40 µM dNTPs as this value
was determined to be optimal for this enzyme (see section a, above). PCRs which
used the modified Tne enzymes contained 140 µM dNTPs as this value was
determined to be optimal for these enzymes (see section a, above).

PCRs which used nTaq or AmpliTaq polymerase contained 10 mM Tris-HCl
(pH 9.0 at 25°C), 50 mM KCl, 1.5 mM MgCl₂, 0.1% Triton X-100; this buffer gives
optimal PCR results for the nTaq and AmpliTaq polymerases. PCRs which used the
nTaq and AmpliTaq enzymes contained 200 µM dNTPs as this value is reported to be
optimal for these enzymes.

A single template, pGEM-luc, was tested in conjunction with two different
primer pairs. The primer pair comprising LME41 (SEQ ID NO:42) and LME45 (SEQ
ID NO:44) amplifies a 500 bp target on the template. The primer pair comprising
LME41 (SEQ ID NO:42) and LME43 (SEQ ID NO:43) amplifies a 1.5 kb target on
the template.

All PCR reactions contained 3.0 units of the enzyme to be tested, 2 ng of
pGEM-luc (Promega) as the template, 50 pmol of each primer pair and 1.5 mM MgCl₂
(this concentration was chosen as it was determined to be optimal for all three types of
enzyme tested using this template and these primer pairs). The final reaction volume
was 100 µl.

The PCRs which amplified the 500 bp target were thermal cycled using the
following conditions: an initial denaturation at 96°C for 2 min, followed by 25 cycles
comprising denaturation at 95°C for 30 sec, annealing at 65°C for 30 sec and
extension at 72°C for 1 min; following the last cycle the tubes were incubated at 4°C.

The PCRs which amplified the 1500 bp target were thermal cycled using the
following conditions: an initial denaturation at 94°C for 2 min, followed by 30 cycles
comprising denaturation at 94°C for 1 min and annealing/extension at 65°C for 2 min;
following the last cycle, the tubes were incubated at 68°C for 10 min and then the
tubes were brought to 4°C.

Thermal cycling was conducted on a Perkin Elmer Thermocycler Model 480.
Each reaction was run in duplicate. The PCR products were resolved by
electrophoresis through 1% agarose gels, stained with EtBr and
quantitated by scanning the gels with a fluoroimager (Molecular Dynamics); product
yields were quantitated using ImageQuant software (Molecular Dynamics). The results
(average of the duplicates) are shown below in Tables 6 and 7. The enzyme which
gave the highest yield was assigned a value of 100% and all other values are expressed
as a percentage of the maximal yield.
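
The yield calculation described above (average the duplicates, then express each enzyme as a percentage of the best performer) might be sketched as follows. The function name and the band-intensity numbers are invented for illustration; the actual quantitation was done with ImageQuant software.

```python
def percent_of_max_yield(duplicates_by_enzyme):
    """Average replicate band intensities and scale so the top enzyme is 100%."""
    means = {name: sum(vals) / len(vals)
             for name, vals in duplicates_by_enzyme.items()}
    top = max(means.values())
    return {name: 100.0 * mean / top for name, mean in means.items()}


# Invented duplicate intensities (arbitrary fluorescence units).
example = {"enzyme A": (10.0, 30.0), "enzyme B": (8.0, 12.0)}
print(percent_of_max_yield(example))  # {'enzyme A': 100.0, 'enzyme B': 50.0}
```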

The results shown in Tables 6 and 7 demonstrate the following.
Regardless of the target size, the modified Tne enzymes produced higher yields as
compared to the UlTma™ enzyme (2.5 to 6.7 fold more product). The modified Tne
enzymes also performed better than nTaq when the 500 bp target was used (about 2
fold more product) and better than the recombinant Taq polymerase (AmpliTaq) when
the 1.5 kb target was used (about 30% more product).

Thus, to produce the same amount of PCR product, fewer units of modified
Tne polymerases are required compared to the use of the UlTma™, nTaq and
AmpliTaq enzymes. These modified Tne polymerases allow the user to use smaller
amounts of enzyme in the PCR and therefore will provide significant cost savings in
PCR applications.

TABLE 6

Product Yields For Amplification Of A 500 bp Target

Enzyme                      % Yield
nTaq                        59
UlTma™                      16
Tne M284                    100
Tne M284(D323E)             89
Tne M284(E325D)             87
Tne M284(Y464F)             91
Tne M284(D468N)             89
Tne M284(D323A, D389A)      91



TABLE 7

Product Yields For Amplification Of A 1.5 kb Target

Enzyme                      % Yield
AmpliTaq                    67
UlTma™                      40
Tne M284                    81
Tne M284(D323E)             70
Tne M284(E325D)             86
Tne M284(Y464F)             93
Tne M284(D468N)             100
Tne M284(D323A, D389A)      87

EXAMPLE 10

Thermal Cycle Sequencing Using Modified Tne Polymerases 



The modified Tne polymerases were further characterized by examining their
ability to be used in thermal cycle sequencing reactions. Initial sequencing reactions
were performed with the Tne M284(D323E) and Tne M284(E325D) mutants
(SEQ ID NOS:16 and 19, respectively) using the buffer and dideoxy mixes from the
fmol® DNA Sequencing System kit (Promega) according to the manufacturer's
instructions. The buffer and concentrations of dideoxy- and deoxyribonucleotides in
this kit were optimized for use with sTaq DNA polymerase (Promega; sTaq is a
modified form of Taq DNA polymerase which has very low 5' to 3' exonuclease
activity).

When the sequencing reactions were performed using purified preparations of
either Tne M284(D323E) (SEQ ID NO:16) or Tne M284(E325D) (SEQ ID NO:19),
the resulting reaction products were very short and uneven and created light bands on
the sequencing gel in every position and in every lane. The light bands in every
position were consistent with the presence of residual 3' to 5' exonuclease activity in
the modified Tne polymerases.

Thermal cycle sequencing reactions were then performed using the triple
mutant Tne M284(D323A, D389A) (SEQ ID NO:35). All thermal cycle sequencing
was carried out using a Perkin-Elmer 9600 thermal cycler in conjunction with the
fmol® DNA Sequencing System reagents (Promega). The template used in the
sequencing reactions was the Tne clone pE325D (Example 4b) and the 5' labeled
primer was the JH66 primer (SEQ ID NO:12). The reactions were run at 95°C for 15
seconds and 70°C for 60 seconds for 30 cycles.

The results of this experiment showed that the use of the triple mutant enzyme
(SEQ ID NO:35) eliminated the light bands in every position which were seen with
Tne M284(D323E) and Tne M284(E325D); however, the sequence ladders were still
shifted toward short extension products. These short intense reads were indicative of a
DNA polymerase having a higher affinity for the dideoxynucleotides than Taq DNA
polymerase.

To examine whether the triple mutant Tne enzyme (SEQ ID NO:35) had a higher
affinity for dideoxynucleotides, the sequencing reactions were repeated using a lower
ratio of ddNTPs to dNTPs in the sequencing reaction mixtures (all other conditions
remained the same). Table 8 lists the 3X mixes used for sTaq and Tne M284(D323A,
D389A) polymerases in this experiment. The use of lower concentrations of
dideoxynucleotides in the sequencing reactions run using the triple mutant Tne enzyme
(SEQ ID NO:35) produced reads as long as those obtained using sTaq.

TABLE 8

              sTaq                      Tne M284(D323A, D389A)
Mix           dNTPs       ddNTPs        dNTPs       ddNTPs
G mix         20 µM       30 µM         20 µM       30 µM
A mix         20 µM       350 µM        20 µM       75 µM
T mix         20 µM       600 µM        40 µM       40 µM
C mix         20 µM       200 µM        40 µM       20 µM



EXAMPLE 11 

Optimization Of Nucleotide Mixtures For Thermal Cycle 
Sequencing Using Tne M284(D323A, D389A) Polymerase 



The results discussed in Example 10 demonstrated that the Tne M284(D323A, 
D389A) polymerase (SEQ ID NO:35) has a higher affinity for dideoxynucleotides than 
does sTaq polymerase. Further testing with various concentrations of dideoxy- and 
deoxynucleotides, including modified dNTPs, in sequencing reactions was performed 
to ascertain the optimal concentration for these reagents. The optimized Tne 
dNTP/ddNTP mixes are detailed in Table 9 below. The values reported in Table 9 
represent 3X mixtures; these mixtures are diluted 3-fold in the final reaction mixture
as described below. To obtain the final concentration of dNTPs and ddNTPs in the
reactions, the values in Table 9 are divided by 3.
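
The conversion from a 3X mix to the final in-reaction concentration is a single division. A minimal sketch, with a hypothetical function name, applied to two of the Table 9 stock values:

```python
def final_concentration_um(stock_um: float, dilution_factor: float = 3.0) -> float:
    """Return the in-reaction concentration (µM) of a component of an N-fold mix."""
    return stock_um / dilution_factor


# 3X stock values from Table 9: each dNTP at 30 µM, ddGTP at 20 µM.
print(final_concentration_um(30.0))             # 10.0
print(round(final_concentration_um(20.0), 2))   # 6.67
```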

TABLE 9

Optimized Nucleotide Mix Formulation for Tne Polymerase

Component       G Nucleotide    A Nucleotide    T Nucleotide    C Nucleotide
                Mix             Mix             Mix             Mix
ddGTP           20 µM           -               -               -
ddATP           -               50 µM           -               -
ddTTP           -               -               75 µM           -
ddCTP           -               -               -               25 µM
7-deaza dGTP    30 µM           30 µM           30 µM           30 µM
dATP            30 µM           30 µM           30 µM           30 µM
dTTP            30 µM           30 µM           30 µM           30 µM
dCTP            30 µM           30 µM           30 µM           30 µM



To illustrate the fact that the Tne M284(D323A, D389A) enzyme has a higher
affinity for ddNTPs as compared to sTaq, the following thermal cycle sequencing
reactions were performed. The template used was pGEM-3Zf(+) (Promega) and the
primer was the γ-³²P end labeled pUC/M13 Forward Primer (Promega; SEQ ID
NO:40). Reactions using sTaq polymerase or the Tne M284(D323A, D389A)
polymerase were performed using dNTP/ddNTP mixes optimized for sTaq as provided
in the fmol® DNA Sequencing System.

The extension products of thermal cycle sequencing reactions performed using
either sTaq or the Tne M284(D323A, D389A) polymerase were resolved by
electrophoresis on a 6% denaturing polyacrylamide gel; following electrophoresis, the
gel was exposed to X-ray film. The resulting autoradiograph is shown in Figure 5,
Panel A.

In Figure 5, Panel A, lanes 1-4 contain reaction products generated using sTaq
and lanes 5-8 contain reaction products generated using the Tne M284(D323A,
D389A) polymerase. In each set of four lanes, reactions run in the presence of
ddGTP, ddATP, ddTTP and ddCTP were loaded left to right.

The results shown in Figure 5, Panel A show that the sequence ladders 
generated using the modified Tne polymerase were shifted toward short extension 
products. These short intense reads were indicative of a DNA polymerase having a 
higher affinity for the dideoxynucleotides than Taq DNA polymerase. 

Sequencing reactions were then performed using sTaq in conjunction with
dNTP/ddNTP mixes optimized for sTaq as provided in the fmol® DNA Sequencing
System or the Tne M284(D323A, D389A) polymerase and the dNTP/ddNTP mixes
shown in Table 9. The template used was pGEM-3Zf(+) (Promega) and the primer
was the γ-³²P end labeled pUC/M13 Forward Primer (Promega; SEQ ID NO:40). An
autoradiograph of reaction products is shown in Figure 5, Panel B.

In Figure 5, Panel B, lanes 1-4 contain reaction products generated using sTaq 
and sequencing mixes optimized for sTaq and lanes 5-8 contain reaction products 
generated using the Tne M284(D323A, D389A) polymerase and sequencing mixes 
shown in Table 9 for the Tne M284(D323A, D389A) polymerase. In each set of four 
lanes, reactions run in the presence of ddGTP, ddATP, ddTTP and ddCTP were loaded 
left to right. 

The results shown in Figure 5, Panel B demonstrate that the Tne M284(D323A, 
D389A) polymerase has a higher affinity for ddNTPs than does sTaq polymerase and 
therefore lower concentrations of ddNTPs must be used in the sequencing reactions. 

Using the optimized mixes shown in Table 9, the Tne M284(D323A, D389A)
enzyme was compared with Sequencing Grade Taq (sTaq; Promega) for its ability to
sequence three different templates. sTaq (Promega) was used in conjunction with the
fmol® DNA Sequencing System (Promega) and all protocols were followed as per the
instructions. The Tne M284(D323A, D389A) polymerase was used in conjunction with
the fmol® DNA Sequencing System kit (Promega), except that the dNTP/ddNTP
mixes used were the Tne optimized mixes shown in Table 9.

The following three DNA templates were used in the indicated amounts in the
thermal cycle sequencing reactions. Forty femtomoles of pGEM-3Zf(+) (Promega)
was sequenced using the γ-³²P end labeled pUC/M13 Forward Primer (Promega; SEQ
ID NO:40). Four femtomoles of a 500 bp PCR fragment was sequenced using the
γ-³²P end labeled LME-28 primer [5'-TAATACGACTCACTATAGGGCGAAT-3'
(SEQ ID NO:47)]. Four femtomoles of λgt11 phage DNA (Promega) was
sequenced using a γ-³²P end labeled λgt11 Forward Primer (Promega).

The 500 bp PCR product used as template was generated by amplification of 
the template pGEM-luc with primers LME41 (SEQ ID NO:42) and LME45 (SEQ ID 
NO:44) as described in Example 9. 

The thermal profile of all sequencing reactions was 95°C for 2 min, then 30 
cycles of 95°C for 30 sec, 70°C for 60 sec; following the last cycle, the reactions were 
brought to 4°C. The thermal cycling was performed using a Perkin-Elmer 480 
Thermal Cycler. The extension products were separated on a 6% denaturing 
polyacrylamide gel; following electrophoresis, the gel was exposed to X-ray film. The 
resulting autoradiographs are shown in Figure 6. 

In Figure 6, three panels are shown. Panel A contains the extension products
generated using pGEM-3Zf(+) as the template; Panel B contains the extension products
generated using the 500 bp PCR product as the template; Panel C contains the
extension products generated using λgt11 phage DNA as the template. In all three
panels, eight lanes are shown. The first four lanes correspond to extension products
generated using sTaq polymerase; the last four lanes correspond to extension products
generated using the Tne M284(D323A, D389A) polymerase. In each set of four lanes,
the termination mixes were used in the following order (left to right): G, A, T and C.

The results shown in Figure 6 demonstrated that the Tne M284(D323A, D389A)
polymerase produced sequencing ladders which were virtually identical in terms of
quality and quantity to those produced using sTaq (Promega); both enzymes provided
the correct DNA sequence for each template. While both enzymes produced similar
results in thermal cycle sequencing reactions, the Tne M284(D323A, D389A)
polymerase required lower concentrations of dideoxynucleotides. Thus, the
use of the Tne M284(D323A, D389A) polymerase will result in considerable cost
savings for thermal cycle sequencing applications.

The optimal ddNTP concentrations shown in Table 9 for the Tne
M284(D323A, D389A) polymerase (SEQ ID NO:35) as compared to the optimal
concentrations for sTaq (Table 8) demonstrate that the modified Tne polymerase has a
greater affinity for all four ddNTPs. In particular, this modified Tne polymerase
requires 8-fold less ddTTP or ddCTP, 7-fold less ddATP and 1.5-fold less ddGTP than
does sTaq in thermal cycle sequencing reactions. As dideoxynucleotides are an
expensive component of the sequencing reaction mixtures, the use of the Tne
M284(D323A, D389A) polymerase (SEQ ID NO:35) in place of enzymes such as Taq
polymerase which have lower affinities for ddNTPs will result in considerable cost
savings. These results also demonstrate that the Tne M284(D323A, D389A) enzyme
can utilize the nucleotide analog 7-deaza dGTP which is used to resolve band
compression artifacts generated when sequencing G+C-rich regions of DNA.
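
The fold reductions quoted above can be checked directly from the ddNTP concentrations of the 3X mixes: the sTaq values in Table 8 and the optimized Tne values in Table 9. The sketch below is illustrative; the dictionary and function names are hypothetical.

```python
# ddNTP concentrations (µM) in the 3X mixes, as read from Tables 8 and 9.
STAQ_DDNTP_UM = {"ddGTP": 30.0, "ddATP": 350.0, "ddTTP": 600.0, "ddCTP": 200.0}
TNE_DDNTP_UM = {"ddGTP": 20.0, "ddATP": 50.0, "ddTTP": 75.0, "ddCTP": 25.0}


def fold_reduction(reference_um, optimized_um):
    """Return how many fold less of each ddNTP the optimized mix contains."""
    return {name: reference_um[name] / optimized_um[name] for name in reference_um}


print(fold_reduction(STAQ_DDNTP_UM, TNE_DDNTP_UM))
# {'ddGTP': 1.5, 'ddATP': 7.0, 'ddTTP': 8.0, 'ddCTP': 8.0}
```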

EXAMPLE 12 

Preferred Sequencing Protocol Using Tne M284(D323A, D389A) Polymerase 

The preferred sequencing protocol uses a thermal cycling format. A detectable 
signal may be generated using either an end radiolabeled primer or a radiolabeled 
dNTP that is incorporated into the extension products. 

a) Sequencing Protocol Using An End-Labeled Primer 
i) Primer Radiolabeling Reaction 

To generate a radiolabeled primer for use in the sequencing reaction, the
following reaction components are assembled in a 0.5 ml microcentrifuge tube: 10
pmol of the desired sequencing primer; 10 pmol of γ-labeled ATP (see Table 10 for
amount); 1 µl of 10X T4 polynucleotide kinase Buffer [500 mM Tris-HCl (pH 7.5);
100 mM MgCl₂; 50 mM DTT; 1.0 mM spermidine]; 5 units T4 polynucleotide kinase
and sterile H₂O to a final volume of 10 µl. The reaction mixture is incubated at 37°C
for 10-30 min (if end-labeling is to be performed using [γ-³⁵S]ATP, 20 units of
polynucleotide kinase are used and the reaction is incubated for 4 hours at 37°C). The
reaction is then terminated by inactivation of the kinase by incubation at 90°C for 2
minutes. The tube is then briefly spun in a microcentrifuge to collect any
condensation. The labeled primer may be used directly in the sequencing reaction
without further purification.

TABLE 10 

Amount of Radiolabeled Nucleotide Needed To Equal 10 pmol 

[γ-32P]ATP:   3.0 µl of 3,000 Ci/mmol, 10 µCi/µl 
              5.0 µl of 5,000 Ci/mmol, 10 µCi/µl 
              0.5 µl of 6,000 Ci/mmol, 135 µCi/µl 

[γ-35S]ATP:   1.4 µl of 1,326 Ci/mmol, 10 µCi/µl 
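
The volumes in Table 10 follow from dividing the radioactive concentration of a stock by its specific activity, which gives the molar concentration of the labeled nucleotide. The following is a minimal Python sketch of that arithmetic; the function name and argument names are illustrative and do not come from the protocol itself.

```python
def volume_for_pmol(target_pmol, specific_activity_ci_per_mmol, conc_uci_per_ul):
    """Return the volume (in µl) of labeled stock containing target_pmol.

    (µCi/µl) / (Ci/mmol) = 1e-6 mmol/µl = 1e3 pmol/µl, hence the
    factor of 1000 in the concentration below.
    """
    pmol_per_ul = conc_uci_per_ul * 1000.0 / specific_activity_ci_per_mmol
    return target_pmol / pmol_per_ul


if __name__ == "__main__":
    # The three [gamma-32P]ATP stocks listed in Table 10.
    for sa, conc in [(3000, 10), (5000, 10), (6000, 135)]:
        vol = volume_for_pmol(10, sa, conc)
        print(f"{sa} Ci/mmol at {conc} µCi/µl: {vol:.2f} µl")
```

Because isotope stocks decay, the radioactive concentration on the reference date is the value to enter; the sketch makes no decay correction.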

ii) Extension/Termination Reactions 

For each set of sequencing reactions, label four 0.5 ml microcentrifuge tubes 
(G, A, T, C). Add 2 µl of the appropriate 3X dNTP/ddNTP Mix to each tube (see 
Table 9 for components of the 3X mixes). Cap the tubes and store on ice or at 4°C 
until needed. For each set of four sequencing reactions, mix the following reagents in 
a microcentrifuge tube: 4-40 fmol of template DNA (see Table 11 below for 
recommended amounts); 5 µl fmol® Sequencing 5X Buffer [250 mM Tris-HCl (pH 9.0 
at 25°C), 10 mM MgCl2]; 1.5 µl labeled primer (1.5 pmol); sterile H2O to a final 
volume of 16 µl (this comprises the primer/template mix). 

Add 1.0 µl of Tne M284(D323A, D389A) DNA Polymerase (5 u/µl) to the 
primer/template mix. Mix briefly by pipetting up and down (this comprises the 
enzyme/primer/template mix). Add 4 µl of the enzyme/primer/template mix to the 
inside wall of each tube containing d/ddNTP mix. Add one drop (approximately 20 
µl) of mineral oil to each tube and briefly spin in a microcentrifuge. Place the 
reaction tubes in a thermal cycler that has been preheated to 95°C and start the cycling 
program. 

TABLE 11 

Recommended Amounts Of Template DNA (ng) for End-Labeled Primer Protocol 

Template Length                             ng of Template 

200 bp (PCR product)                        0.5 ng (4 fmol) 
3,000-5,000 bp (supercoiled plasmid DNA)    100 ng (40 fmol) 
48,000 bp (lambda DNA)                      130 ng (4 fmol) 
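
The mass and molar amounts in Table 11 are related through the template length. As a rough check, the conversion can be sketched in Python as follows; the average mass of 660 g/mol per base pair is a common approximation and an assumption here, not a figure stated in the protocol, and the function name is illustrative.

```python
AVG_BP_MASS = 660.0  # g/mol per base pair of double-stranded DNA (approximation)


def template_ng(fmol, length_bp):
    """Mass in ng of `fmol` femtomoles of a double-stranded template."""
    grams = fmol * 1e-15 * length_bp * AVG_BP_MASS
    return grams * 1e9


if __name__ == "__main__":
    # Rows of Table 11; 4,000 bp stands in for the 3,000-5,000 bp range.
    for label, fmol, bp in [("PCR product", 4, 200),
                            ("plasmid", 40, 4000),
                            ("lambda DNA", 4, 48000)]:
        print(f"{label}: {template_ng(fmol, bp):.1f} ng")
```

The computed values (about 0.5 ng, 106 ng and 127 ng) agree with the rounded amounts recommended in the table.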

When the primer used is less than 24 nucleotides in length or has a 
G+C-content less than 50%, the following cycling profile is used: 95°C for 2 minutes 
followed by 30 cycles of 95°C for 30 seconds (denaturation); 42°C for 30 seconds 
(annealing); 70°C for 1 minute (extension); the tubes are then brought to 4°C. 

When the primer used is greater than or equal to 24 nucleotides in length, or 
when a shorter primer having a G+C-content greater than or equal to 50% is used, the 
following cycling profile is used: 95°C for 2 minutes followed by 30 cycles of 95°C 
for 30 seconds (denaturation); 70°C for 30 seconds (annealing/extension); the tubes are 
then brought to 4°C. 
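
The choice between the two cycling profiles can be expressed as a small decision function. The Python sketch below is illustrative only. It assumes that the two-step profile applies whenever the primer is at least 24 nucleotides long or has a G+C content of at least 50%, and that every other primer receives the three-step profile; the text leaves the case of a long, G+C-poor primer open to interpretation, so this reading is an assumption. All names are hypothetical.

```python
def gc_fraction(primer):
    """Fraction of G and C bases in the primer sequence."""
    primer = primer.upper()
    return sum(base in "GC" for base in primer) / len(primer)


def cycling_profile(primer):
    """Return the cycling profile as (step name, temperature in C, seconds).

    Assumption: two-step cycling for primers >= 24 nt or >= 50% G+C,
    three-step cycling (42 C annealing) for all other primers.
    """
    if len(primer) >= 24 or gc_fraction(primer) >= 0.5:
        steps = [("denaturation", 95, 30), ("annealing/extension", 70, 30)]
    else:
        steps = [("denaturation", 95, 30), ("annealing", 42, 30),
                 ("extension", 70, 60)]
    return {"initial_denaturation": (95, 120), "cycles": 30,
            "steps": steps, "hold_temp_c": 4}


if __name__ == "__main__":
    print(cycling_profile("ATTATAATTATAATTAAT")["steps"])
```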

After the thermocycling program has been completed, add 3 µl of fmol® 
Sequencing Stop Solution [10 mM NaOH; 95% formamide; 0.05% bromophenol blue; 
0.05% xylene cyanol] to the inside wall of each tube. Briefly spin the tubes in a 
microcentrifuge to terminate the reactions. Heat the reactions at 70°C for 2 minutes 
immediately before loading onto a sequencing gel. Load 2.5-3.0 µl of each reaction 
on the gel. 

b) Sequencing Protocol Using Direct Incorporation 
i) Extension/Termination Reactions 

The following procedure requires about 500 fmol of plasmid templates and 
about 40 fmol of PCR product. The end-labeled primer protocol (section a) is 
recommended for PCR templates. This procedure is not recommended for the 
sequencing of lambda templates. 

For each set of sequencing reactions, label four 0.5 ml microcentrifuge tubes 
(G, A, T, C). Add 2 µl of the appropriate d/ddNTP Mix to each tube. Cap the tubes 
and store on ice or at 4°C until needed. For each set of four sequencing reactions, 
mix the following reagents in a microcentrifuge tube: 500 fmol template DNA 
(approx. 1 µg of a 3-5 kb template); 3.0 pmol primer (approx. 25 ng of a 24 mer); 
0.5 µl [α-35S]dATP (>1,000 Ci/mmol, 10 µCi/µl) or [α-32P]dATP (800 Ci/mmol, 
10 µCi/µl); 5 µl fmol® Sequencing 5X Buffer and sterile H2O to a final volume of 16 µl. 

Add 1.0 µl of Tne M284(D323A, D389A) DNA Polymerase (5 u/µl) to the 
primer/template mix. Mix briefly by pipetting up and down. Add 4 µl of the 
enzyme/primer/template mix to the inside wall of each tube containing d/ddNTP Mix. 
Add one drop (approximately 20 µl) of mineral oil to each tube and briefly spin in a 
microcentrifuge. Place the reaction tubes in a thermal cycler that has been preheated 
to 95°C and start the cycling program. The cycling profile chosen depends upon the 
characteristics of the primer used; see section a(ii) above. 

After the thermocycling program has been completed, add 3 µl of fmol® 
Sequencing Stop Solution to the inside wall of each tube. Briefly spin in a 
microcentrifuge to terminate the reactions. Heat the reactions at 70°C for 2 minutes 
immediately before loading on a sequencing gel. Load 2.5-3.0 µl of each reaction on 
the gel. 

EXAMPLES X > 
Use Of Tne M284(D323A, D389A) 
Polymerase In Sanger Sequencing Protocols 

The above examples described the use of the Tne M284(D323A, D389A) 
polymerase in thermal cycle sequencing protocols. The Tne M284(D323A, D389A) 
Polymerase may also be used in traditional Sanger sequencing protocols. 

If a double-stranded DNA template is used, the template is first denatured using 
alkali as follows. Four micrograms (approximately 2 pmol) of supercoiled plasmid 
DNA is added to a microcentrifuge tube and deionized H2O is added to a final volume 
of 18 µl. Two microliters of 2 M NaOH, 2 mM EDTA is added and the mixture is 
incubated for 5 minutes at room temperature. To neutralize the reaction, add 8 µl of 
5 M ammonium acetate, pH 7.5, and vortex. Add 112 µl of 100% ethanol and vortex. 
Centrifuge the tube for 10 minutes at top speed in a microcentrifuge. Decant the 
supernatant. Wash the pellet with 1 ml of 70% ethanol and centrifuge for 1 minute. 
Remove the supernatant and dry the pellet. Resuspend the dried pellet in 18 µl of 
distilled water for sequencing. Proceed to either section a) or c) depending on whether 
an end-labeled primer is employed or whether radiolabeled nucleotides are employed 
in the sequencing reaction. 

a) Sequencing Protocol Using An End-Labeled Primer 

The primer is end labeled using the protocol described in Example 11(a)(i). 
The template and primer are annealed as follows. When a single-stranded DNA 
template is employed the following reaction is used. For each set of four sequencing 
reactions, mix the following reagents in a microcentrifuge tube: 0.8 pmol ssDNA 
(approx. 2 µg of an M13 template); 5.0 µl of Taq DNA Polymerase 5X Buffer [250 mM 
Tris-HCl (pH 9.0 at 25°C); 50 mM MgCl2]; 1.0 µl labeled primer (1 pmol); sterile 
dH2O to a final volume of 25 µl. Incubate at 37°C for 10 minutes. During the 
incubation, prepare the extension/termination reaction tubes as described in section b) 
below. 

When a double-stranded plasmid is used as the template, the following reaction 
is used. For each set of four sequencing reactions, mix the following reagents in a 
microcentrifuge tube: 1.6 pmol denatured plasmid dsDNA (approx. 4 µg of a 3-5 kb 
template); 5.0 µl Taq DNA Polymerase 5X Buffer; 2.0 µl labeled primer (2 pmol) and 
sterile dH2O to a final volume of 25 µl. Incubate at 37°C for 10 minutes. During the 
incubation, prepare the extension/termination reaction tubes as described in section b) 
below. 

b) Extension/Termination Reactions 

For each set of sequencing reactions, label four microcentrifuge tubes (G, A, T 
and C) and add 1 µl of the 8X Tne optimized d/ddNTP Mix (see Table 12 for 
components of the 8X mixes) to each tube. Cap the tubes and store on ice or at 4°C 
until needed. Add 1 µl of Tne M284(D323A, D389A) DNA Polymerase (5 u/µl) to the 
annealed primer/template mix (prepared as described above) and mix briefly by 
pipetting up and down. 

Add 6 µl of the enzyme/primer/template mix to each of the four tubes 
containing the d/ddNTP Mixes. Mix briefly by pipetting up and down. A brief spin 
may be needed to ensure that no liquid is left on the tube walls. Incubate at 70°C for 
15 minutes. Add 4 µl of Stop Solution to each tube and set at room temperature. 
Heat the reactions to >70°C for 2-5 minutes before loading the sequencing gel. Load 
2.5-3.0 µl of each reaction on the gel (6% denaturing polyacrylamide). Following 
electrophoresis of the sequencing gel, the gel is exposed to X-ray film to generate an 
autoradiograph. 

If the extension products seen on the autoradiograph are too short, the ddNTP 
concentrations should be lowered; conversely, if the extension products are all 
skewed to high molecular weight products, the ddNTP concentrations should be raised. 

TABLE 12 
8X Nucleotide Mix Formulation for Sanger 
Sequencing Using Tne M284(D323A, D389A) Polymerase 

                  G            A            T            C 
Nucleotide        Nucleotide   Nucleotide   Nucleotide   Nucleotide 
Component         Mix          Mix          Mix          Mix 

ddGTP             20 µM 
ddATP                          50 µM 
ddTTP                                       75 µM 
ddCTP                                                    25 µM 
7-deaza dGTP      250 µM       250 µM       250 µM 

c) Sequencing Protocol Using Direct Incorporation 
i) Annealing The Template And Primer 

When a single-stranded DNA template is employed the following reaction is 
used to anneal the template and primer. For each set of four sequencing reactions, 
mix the following reagents in a microcentrifuge tube: 0.8 pmol ssDNA (approx. 2 µg of 
an M13 template); 1.0 pmol primer (approx. 8 ng of a 24 mer); 5.0 µl Taq DNA 
Polymerase 5X Buffer; 2.0 µl Extension/Labeling Mix [7.5 µM each of dGTP, dTTP 
and dCTP]; sterile dH2O to a final volume of 25 µl. Incubate at 37°C for 10 minutes. 
During the incubation, prepare the nucleotide tubes for the termination reaction as 
described in section iii, below. 

When a double-stranded DNA template is employed the following reaction is 
used to anneal the template and primer. For each set of four sequencing reactions, 
mix the following reagents in a microcentrifuge tube: 1.6 pmol denatured plasmid 
dsDNA (approx. 4 µg of a 3-5 kb template); 2 pmol primer (approx. 16 ng of a 24 mer); 
5.0 µl Taq DNA Polymerase 5X Buffer; 2.0 µl Extension/Labeling Mix; sterile dH2O to 
a final volume of 25 µl. Incubate at 37°C for 10 minutes. During the incubation, 
prepare the nucleotide tubes for the termination reaction as described in section iii, 
below. 
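
The approximate primer masses quoted in the annealing reactions (8 ng for 1 pmol and 16 ng for 2 pmol of a 24 mer) can be reproduced with a short calculation. The Python sketch below assumes an average mass of 330 g/mol per nucleotide of single-stranded DNA, which is a common rule of thumb rather than a value given in the protocol; the function name is illustrative.

```python
AVG_NT_MASS = 330.0  # g/mol per nucleotide of single-stranded DNA (approximation)


def oligo_ng(pmol, length_nt):
    """Mass in ng of `pmol` picomoles of a single-stranded oligonucleotide."""
    grams = pmol * 1e-12 * length_nt * AVG_NT_MASS
    return grams * 1e9


if __name__ == "__main__":
    for pmol in (1.0, 2.0):
        print(f"{pmol} pmol of a 24 mer: {oligo_ng(pmol, 24):.1f} ng")
```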

ii) Extension/Labeling Reaction 
Add 0.5 µl of [α-35S]dATP (1,000 Ci/mmol, approximately 10 µCi/µl) or 0.5 µl 
of [α-32P]dATP (800 Ci/mmol, approximately 10 µCi/µl) to the annealed 
primer/template mixture. Add 1 µl of Tne M284(D323A, D389A) DNA Polymerase 
(5 u/µl) and mix briefly by pipetting up and down. Incubate at 37°C for 5 minutes. 

iii) Termination Reaction 

For each set of sequencing reactions, label four microcentrifuge tubes 
(G, A, T, C) and add 1 µl of the 8X Tne optimized d/ddNTP Mix to each tube (see 
Table 12 above). Store on ice or at 4°C until just before completion of the 
extension/labeling reaction. When the extension/labeling reaction is complete, aliquot 
6 µl to each tube (G, A, T, C) containing d/ddNTP Mix. Mix briefly by pipetting up 
and down. A brief spin may be needed to ensure that no liquid is left on the tube 
walls. Incubate at 70°C for 15 minutes. Add 4 µl of Stop Solution to each tube and 
store at -20°C. Heat the reactions to >70°C for 2-5 minutes immediately before 
loading on a sequencing gel. Load 2.5-3.0 µl of each reaction on the gel. Following 
electrophoresis of the sequencing gel, the gel is exposed to X-ray film to generate an 
autoradiograph. 

If the extension products seen on the autoradiograph are too short, the ddNTP 
concentrations should be lowered; conversely, if the extension products are all 
skewed to high molecular weight products, the ddNTP concentrations should be raised. 



EXAMPLE 12 

Fidelity Of Tne DNA Polymerases 



The fidelity of the Tne polymerases (full-length and modified forms) is 
measured using a PCR fidelity assay. This assay is based on the amplification, 
circularization, and transformation of the pUC19 derivative pLACIQ, which contains 
a functional lacIq allele [Frey and Suppmann, Biochemica 2:8 (1995)]. PCR-derived 
mutations in lacI result in a derepression of the expression of lacZα and subsequent 
formation of a functional β-galactosidase enzyme, which can be easily detected on 
X-Gal indicator plates. 

a) Construction Of pLACIQ 

The truncated lacI gene present in pUC19 is replaced by a functional copy of 
lacIq. pUC19 (GibcoBRL) is digested with PvuII and AflIII and the 180 bp 
PvuII-AflIII fragment of pUC19 is replaced by a 1189 bp DNA fragment encoding 
lacIq. The 1189 bp lacIq fragment is created by PCR amplification of residues 
2972-4142 of pTrc99A (Pharmacia). The following primer pair is used in the PCR: 
5'-CATGCCATGGCATGCATTTACGTTGACACCA-3' (SEQ ID NO:48) and 
5'-TCCCCCGGGTTGCGCTCACTGCCCGCTTTCCAGT-3' (SEQ ID NO:49). The 
oligonucleotide of SEQ ID NO:48 contains an NcoI recognition site and the 
oligonucleotide of SEQ ID NO:49 contains a SmaI recognition site. The PCR is 
performed using 18 cycles of denaturation at 94°C for 30 sec; annealing at 57°C for 
30 sec and extension at 72°C for 4 min. The PCR is performed using Pfu DNA 
polymerase (Stratagene) in the buffer recommended by the manufacturer. The PCR 
products are digested with NcoI and SmaI to generate a 1189 bp fragment having a 
blunt end (compatible with PvuII ends) and a 5' overhanging end compatible with the 
AflIII-digested end of pUC19. The 1189 bp fragment is ligated into the digested 
pUC19 using standard techniques. 
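
The presence of the NcoI and SmaI recognition sites in the two primers can be confirmed with a simple string search. The Python sketch below is illustrative; the primer strings are SEQ ID NO:48 and SEQ ID NO:49 as given above with spacing removed, the recognition sequences (CCATGG for NcoI, CCCGGG for SmaI) are the standard ones, and the function and variable names are hypothetical.

```python
# Standard recognition sequences for the two enzymes named in the text.
SITES = {"NcoI": "CCATGG", "SmaI": "CCCGGG"}

PRIMERS = {
    "SEQ ID NO:48": "CATGCCATGGCATGCATTTACGTTGACACCA",
    "SEQ ID NO:49": "TCCCCCGGGTTGCGCTCACTGCCCGCTTTCCAGT",
}


def find_sites(sequence):
    """Map each enzyme whose site occurs in `sequence` to its 0-based position."""
    sequence = sequence.upper()
    return {name: sequence.find(site)
            for name, site in SITES.items() if site in sequence}


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(name, find_sites(seq))
```

Both sites lie near the 5' ends of the primers, so they are carried into the PCR product without disturbing the template-complementary 3' portions.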

The ligation products are used to transform the α-complementing E. coli strain 
DH5α (GibcoBRL) and the desired plasmid, termed pLACIQ (3695 bp), is isolated 
using standard techniques. DH5α cells containing pLACIQ produce white (LACI+) 
colonies on LB plates containing ampicillin (100 µg/ml) and X-Gal (0.004% w/v). 

b) The PCR Fidelity Assay 

The template used in the PCR fidelity assay is prepared as follows. 
pLACIQ is linearized by digestion with Dra II. A typical PCR reaction contains 5 or 
10 ng of linearized, gel-purified plasmid DNA. 

The following primers are used in the PCR fidelity assay to amplify the lacIq 
sequences of pLACIQ; both PCR primers used have ClaI cleavage sites at their 5' 
ends: 5'-AGCTTATCGATGGCACTTTTCGGGGAAATGTGCG-3' (SEQ ID NO:50) 
and 5'-AGCTTATCGATAAGCGATGCCGGGAGCAGACAAGC-3' (SEQ ID 
NO:51). The length of the resulting PCR product is 3,561 bp. 

The PCRs which employ the Tne DNA polymerases or UlTma DNA 
polymerase are performed using the following buffer [10 mM Tris-HCl (pH 8.8 at 
25°C), 10 mM KCl, 0.002% Tween 20, 1.5 mM MgCl2 and 40 µM all four dNTPs]. 
PCRs which employ nTaq DNA polymerase are performed using the following buffer 
[10 mM Tris-HCl (pH 9.0 at 25°C), 50 mM KCl, 1.5 mM MgCl2, 0.1% Triton X-100 
and 140 µM all four dNTPs]. All reactions contain 5-10 ng of template, 20 pmol of 
each primer and 2.5 units of polymerase (all polymerases are assayed using the assay 
described in Example 5 to ensure the same amount of activity is used for each enzyme 
tested). The following cycling conditions are used: denaturation for 10 sec at 94°C; 
annealing for 30 sec at 57°C and extension for 4 min at 72°C for 18 cycles. 

Following the PCR, the amplification products are PEG-precipitated as follows. 
The PCR amplification mixtures are frozen at -20°C and the mineral oil is removed by 
rinsing twice with cold chloroform. The mixtures are then thawed and 10 µg of 
glycogen and 1/2 volume of 30% PEG 3350/1.5 M NaCl are added. The mixture is 
allowed to stand for a minimum of 1 hr at 0-4°C. The mixture is then centrifuged in a 
microcentrifuge for 8 min and the supernatant is removed. The pellet is then rinsed 
with 75% ethanol and dried. The DNA is then digested with ClaI and the digested 
DNA is purified by gel electrophoresis. The purified DNA is then ligated to 
recircularize the plasmid in a reaction containing less than or equal to 30 ng DNA. 

The resulting PCR-derived plasmids are transformed into competent E. coli 
DH5α and plated on LB Amp 100 X-Gal plates [LB plates containing 100 µg/ml 
ampicillin and 0.004% X-Gal (w/v)]. After incubation overnight at 37°C, blue and 
white colonies are counted. The error rate (f) per bp is calculated using the rearranged 
equation published by Keohavong and Thilly [Proc. Natl. Acad. Sci. USA 86:9253 
(1989)]: f = -ln F / (d x b), where F is the fraction of white colonies: F = white 
(LACI+)/total colony number; d is the number of DNA duplications: 2^d = output 
DNA/input DNA; and b is the effective target size in bp of the (1080 bp) lacI gene, which is 
349 bp according to Provost et al. [Mut. Research 288:133 (1993)]; there are 349 
phenotypically identified (by color screening) single-base substitutions (nonsense and 
mis-sense) at 179 codons (approximately 50% of the coding region) within the lacI 
gene [Provost et al., supra]. Frameshift errors, which may occur at every position in 
the 1080 bp open reading frame of lacI, are not taken into account. 
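
The error-rate calculation can be sketched in Python as shown below. This is a minimal illustration under stated assumptions: the sign of the logarithm is taken so that f is positive when F is less than 1, the number of duplications d is obtained as the base-2 logarithm of the output/input DNA ratio, and the colony counts and DNA amounts in the usage example are invented for illustration and are not results from the assay. The function name is hypothetical.

```python
import math


def error_rate(white, total, output_ng, input_ng, target_bp=349):
    """Error rate per bp per duplication: f = -ln(F) / (d * b).

    F is the fraction of white (LACI+) colonies, d the number of DNA
    duplications (2**d = output DNA / input DNA) and b the effective
    target size in bp (349 bp for lacI according to the text).
    """
    fraction_white = white / total
    duplications = math.log2(output_ng / input_ng)
    return -math.log(fraction_white) / (duplications * target_bp)


if __name__ == "__main__":
    # Hypothetical numbers: 950 white colonies out of 1000, and a
    # 1024-fold amplification (10 duplications) of 10 ng of template.
    f = error_rate(white=950, total=1000, output_ng=10240, input_ng=10)
    print(f"error rate: {f:.2e} per bp per duplication")
```

A lower fraction of white colonies or fewer duplications for the same fraction both yield a higher calculated error rate.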

A religation control is prepared as follows. Fifty nanograms of DraII-linearized, 
gel-purified pLACIQ DNA is religated, and an aliquot of the ligation 
reaction is transformed into DH5α. After incubation overnight, the number of 
growing colonies (0.027%) showing a blue (LACI-) phenotype on LB Amp X-Gal 
plates is measured to assess the rate of the formation of concatameric ligation products 
(with subsequent intramolecular recombination in E. coli that eliminates an additional 
origin of replication), which seems to be a very rare event. Restriction analysis of 
PCR-derived plasmids isolated from blue colonies is also performed to confirm that the 
LACI- phenotype originates in PCR-derived mutations of lacI, but not in deleterious 
recombination events after transformation of the ligated DNA in DH5α. 

From the above, it is clear that the enzymes of the present invention provide 
thermostable DNA polymerases having novel features. In particular, these enzymes 
provide superior polymerases for use in PCR applications. In addition, these 
polymerases have higher affinities for ddNTPs, which results in significant cost savings 
for users of chain termination sequencing protocols. 

Other modifications of the embodiments of the invention described above that 
are obvious to those of ordinary skill in the areas of molecular biology, biochemistry 
and related disciplines are intended to be within the scope of the accompanying claims. 

SEQUENCE LISTING 



(1) GENERAL INFORMATION: 

(i) APPLICANT: SLATER, MICHAEL R. 

HARTNETT, JAMES R. 
HUANG , FEN 
BOLCHAKOVA , ELENA 

(ii) TITLE OF INVENTION: MODIFIED THERMOPHILIC DNA POLYMERASES 
FROM THERMOTOGA NEAPOLITANA 

(iii) NUMBER OF SEQUENCES: 51 

(iv) CORRESPONDENCE ADDRESS : 

(A) ADDRESSEE : HAVERSTOCK, MEDLEN & CARROLL 

(B) STREET: 220 MONTGOMERY STREET, SUITE 2200 

(C) CITY: SAN FRANCISCO 

(D) STATE: CALIFORNIA 

(E) COUNTRY : USA 

(F) ZIP : 94104 

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS 

(D) SOFTWARE: PatentIn Release #1.0, Version #1.25 

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: US 

(B) FILING DATE: 

(C) CLASSIFICATION: 

(viii) ATTORNEY/ AGENT INFORMATION: 

(A) NAME: CARROLL, PETER G. 

(B) REGISTRATION NUMBER: 32,837 

(C) REFERENCE /DOCKET NUMBER: PRMG- 01175 

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: (415) 705-8410 

(B) TELEFAX: (415) 397-8338 

(2) INFORMATION FOR SEQ ID NO:1: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2682 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(ix) FEATURE: 

(A) NAME/ KEY : CDS 

(B) LOCATION: 1..2682 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1: 

ATG GCG AGA CTA TTT CTC TTT GAT GGC ACA GCC CTG GCC TAC AGG GCA 48 
Met Ala Arg Leu Phe Leu Phe Asp Gly Thr Ala Leu Ala Tyr Arg Ala 
1 5 10 15 

TAT TAC GCC CTC GAC AGA TCC CTT TCC ACA TCC ACA GGA ATT CCA ACG 96 
Tyr Tyr Ala Leu Asp Arg Ser Leu Ser Thr Ser Thr Gly Ile Pro Thr 
20 25 30 

AAC GCC GTC TAT GGC GTT GCC AGG ATG CTC GTT AAA TTC ATA AAG GAA 144 
Asn Ala Val Tyr Gly Val Ala Arg Met Leu Val Lys Phe Ile Lys Glu 
35 40 45 

CAC ATT ATA CCC GAA AAG GAC TAC GCG GCT GTG GCC TTC GAC AAG AAG 192 
His Ile Ile Pro Glu Lys Asp Tyr Ala Ala Val Ala Phe Asp Lys Lys 
50 55 60 

GCA GCG ACG TTC AGA CAC AAA CTG CTC GAA GCG TAC AAG GCG CAA AGG 240 
Ala Ala Thr Phe Arg His Lys Leu Leu Glu Ala Tyr Lys Ala Gln Arg 
65 70 75 80 

CCA AAG ACG CCG GAT CTT CTA GTT CAG CAG CTA CCT TAC ATC AAG CGG 288 
Pro Lys Thr Pro Asp Leu Leu Val Gln Gln Leu Pro Tyr Ile Lys Arg 
85 90 95 

CTG ATA GAA GCT CTT GGT TTC AAA GTG CTG GAG CTG GAA GGA TAC GAA 336 
Leu Ile Glu Ala Leu Gly Phe Lys Val Leu Glu Leu Glu Gly Tyr Glu 
100 105 110 

GCA GAC GAT ATC ATC GCC ACG CTT GCA GTC AAG GGC TGC ACG TTT TTT 384 
Ala Asp Asp Ile Ile Ala Thr Leu Ala Val Lys Gly Cys Thr Phe Phe 
115 120 125 

GAT GAG ATT TTC ATA ATA ACC GGT GAC AAG GAT ATG CTT CAA CTT GTA 432 
Asp Glu Ile Phe Ile Ile Thr Gly Asp Lys Asp Met Leu Gln Leu Val 
130 135 140 

AAC GAG AAG ATA AAG GTC TGG AGA ATC GTC AAG GGG ATA TCG GAT CTT 480 
Asn Glu Lys Ile Lys Val Trp Arg Ile Val Lys Gly Ile Ser Asp Leu 
145 150 155 160 

GAG CTT TAC GAT TCG AAA AAG GTG AAA GAA AGA TAC GGT GTG GAA CCA 528 
Glu Leu Tyr Asp Ser Lys Lys Val Lys Glu Arg Tyr Gly Val Glu Pro 
165 170 175 

CAT CAG ATA CCG GAT CTT CTA GCA CTG ACG GGA GAC GAG ATA GAC AAC 576 
His Gln Ile Pro Asp Leu Leu Ala Leu Thr Gly Asp Glu Ile Asp Asn 
180 185 190 

ATT CCC GGT GTA ACG GGA ATA GGT GAA AAG ACC GCT GTA CAG CTT CTC 624 
Ile Pro Gly Val Thr Gly Ile Gly Glu Lys Thr Ala Val Gln Leu Leu 
195 200 205 

GGC AAG TAC AGA AAT CTT GAA GAC ATT CTG GAG CAT GCC CGT GAA CTC 672 
Gly Lys Tyr Arg Asn Leu Glu Asp Ile Leu Glu His Ala Arg Glu Leu 
210 215 220 

CCC CAG AGA GTG AGA AAG GCT CTC TTG AGA GAC AGG GAA GTT GCC ATC 720 
Pro Gln Arg Val Arg Lys Ala Leu Leu Arg Asp Arg Glu Val Ala Ile 
225 230 235 240 

CTC AGT AAA AAA CTT GCA ACT CTG GTG ACG AAC GCA CCT GTT GAA GTG 768 
Leu Ser Lys Lys Leu Ala Thr Leu Val Thr Asn Ala Pro Val Glu Val 
245 250 255 

GAC TGG GAA GAG ATG AAA TAC AGA GGA TAC GAC AAG AGA AAA CTA CTT 816 
Asp Trp Glu Glu Met Lys Tyr Arg Gly Tyr Asp Lys Arg Lys Leu Leu 
260 265 270 

CCG ATA TTG AAA GAA CTG GAG TTT GCT TCC ATC ATG AAG GAA CTT CAA 864 
Pro Ile Leu Lys Glu Leu Glu Phe Ala Ser Ile Met Lys Glu Leu Gln 
275 280 285 

CTG TAC GAA GAA GCA GAA CCC ACC GGA TAC GAA ATC GTG AAG GAT CAT 912 
Leu Tyr Glu Glu Ala Glu Pro Thr Gly Tyr Glu Ile Val Lys Asp His 
290 295 300 

AAG ACC TTC GAA GAT CTC ATC GAA AAG CTG AAG GAG GTT CCA TCT TTT 960 
Lys Thr Phe Glu Asp Leu Ile Glu Lys Leu Lys Glu Val Pro Ser Phe 
305 310 315 320 

GCC CTG GAC CTT GAA ACG TCC TCC CTT GAC CCG TTC AAC TGT GAG ATA 1008 
Ala Leu Asp Leu Glu Thr Ser Ser Leu Asp Pro Phe Asn Cys Glu Ile 
325 330 335 

GTC GGC ATC TCC GTG TCG TTC AAA CCG AAA ACA GCT TAT TAC ATT CCA 1056 
Val Gly Ile Ser Val Ser Phe Lys Pro Lys Thr Ala Tyr Tyr Ile Pro 
340 345 350 

CTT CAT CAC AGA AAC GCC CAG AAT CTT GAT GAA ACA CTG GTG CTG TCG 1104 
Leu His His Arg Asn Ala Gln Asn Leu Asp Glu Thr Leu Val Leu Ser 
355 360 365 

AAG TTG AAA GAG ATC CTC GAA GAC CCG TCT TCG AAG ATT GTG GGT CAG 1152 
Lys Leu Lys Glu Ile Leu Glu Asp Pro Ser Ser Lys Ile Val Gly Gln 
370 375 380 

AAC CTG AAG TAC GAC TAC AAG GTT CTT ATG GTA AAG GGT ATA TCG CCA 1200 
Asn Leu Lys Tyr Asp Tyr Lys Val Leu Met Val Lys Gly Ile Ser Pro 
385 390 395 400 

GTT TAT CCG CAT TTT GAC ACG ATG ATA GCT GCA TAT TTG CTG GAG CCA 1248 
Val Tyr Pro His Phe Asp Thr Met Ile Ala Ala Tyr Leu Leu Glu Pro 
405 410 415 

AAC GAG AAA AAA TTC AAT CTC GAA GAT CTG TCT TTG AAA TTT CTC GGA 1296 
Asn Glu Lys Lys Phe Asn Leu Glu Asp Leu Ser Leu Lys Phe Leu Gly 
420 425 430 

TAC AAA ATG ACG TCT TAT CAG GAA CTG ATG TCG TTT TCC TCA CCA CTT 1344 
Tyr Lys Met Thr Ser Tyr Gln Glu Leu Met Ser Phe Ser Ser Pro Leu 
435 440 445 

TTT GGT TTC AGC TTT GCG GAT GTT CCG GTA GAC AAG GCT GCG AAC TAC 1392 
Phe Gly Phe Ser Phe Ala Asp Val Pro Val Asp Lys Ala Ala Asn Tyr 
450 455 460 

TCC TGC GAG GAT GCA GAC ATC ACT TAT AGG CTC TAC AAG ATA CTC AGC 1440 
Ser Cys Glu Asp Ala Asp Ile Thr Tyr Arg Leu Tyr Lys Ile Leu Ser 
465 470 475 480 

ATG AAG CTC CAT GAA GCG GAA CTT GAG AAC GTC TTC TAC AGG ATA GAG 1488 
Met Lys Leu His Glu Ala Glu Leu Glu Asn Val Phe Tyr Arg Ile Glu 
485 490 495 

ATG CCG CTT GTG AAC GTT CTT GCA CGC ATG GAA TTG AAC GGG GTG TAT 1536 
Met Pro Leu Val Asn Val Leu Ala Arg Met Glu Leu Asn Gly Val Tyr 
500 505 510 

GTG GAC ACA GAA TTC CTG AAA AAG CTC TCG GAG GAG TAC GGC AAA AAG 1584 
Val Asp Thr Glu Phe Leu Lys Lys Leu Ser Glu Glu Tyr Gly Lys Lys 
515 520 525 

CTC GAG GAA CTG GCC GAA AAA ATC TAC CAG ATA GCA GGA GAG CCC TTC 1632 
Leu Glu Glu Leu Ala Glu Lys Ile Tyr Gln Ile Ala Gly Glu Pro Phe 
530 535 540 

AAC ATC AAT TCT CCA AAA CAG GTT TCA AAG ATC CTT TTT GAG AAG CTG 1680 
Asn Ile Asn Ser Pro Lys Gln Val Ser Lys Ile Leu Phe Glu Lys Leu 
545 550 555 560 

GGA ATA AAA CCC CGT GGA AAA ACG ACA AAA ACA GGA GCG TAC TCT ACC 1728 
Gly Ile Lys Pro Arg Gly Lys Thr Thr Lys Thr Gly Ala Tyr Ser Thr 
565 570 575 

AGG ATA GAG GTG TTG GAA GAG ATA GCG AAT GAG CAC GAG ATA GTA CCC 1776 
Arg Ile Glu Val Leu Glu Glu Ile Ala Asn Glu His Glu Ile Val Pro 
580 585 590 

CTC ATT CTC GAG TAC AGA AAG ATC CAG AAA CTG AAA TCG ACC TAC ATA 1824 
Leu Ile Leu Glu Tyr Arg Lys Ile Gln Lys Leu Lys Ser Thr Tyr Ile 
595 600 605 

GAC ACC CTT CCG AAA CTT GTG AAC CCG AAA ACC GGA AGA ATT CAT GCA 1872 
Asp Thr Leu Pro Lys Leu Val Asn Pro Lys Thr Gly Arg Ile His Ala 
610 615 620 

TCT TTC CAC CAG ACG GGT ACC GCC ACT GGC AGG TTG AGT AGC AGT GAT 1920 
Ser Phe His Gln Thr Gly Thr Ala Thr Gly Arg Leu Ser Ser Ser Asp 
625 630 635 640 

CCA AAT CTT CAG AAT CTT CCG ACA AAG AGC GAA GAG GGA AAA GAA ATT 1968 
Pro Asn Leu Gln Asn Leu Pro Thr Lys Ser Glu Glu Gly Lys Glu Ile 
645 650 655 

AGA AAA GCG ATT GTG CCC CAG GAT CCA GAC TGG TGG ATC GTC AGT GCG 2016 
Arg Lys Ala Ile Val Pro Gln Asp Pro Asp Trp Trp Ile Val Ser Ala 
660 665 670 

GAT TAT TCC CAA ATA GAA CTC AGA ATC CTC GCT CAT CTC AGT GGT GAT 2064 
Asp Tyr Ser Gln Ile Glu Leu Arg Ile Leu Ala His Leu Ser Gly Asp 
675 680 685 

GAG AAC CTT GTG AAG GCC TTC GAG GAG GGC ATC GAT GTG CAC ACC TTG 2112 
Glu Asn Leu Val Lys Ala Phe Glu Glu Gly Ile Asp Val His Thr Leu 
690 695 700 

ACT GCC TCC AGG ATC TAC AAC GTA AAG CCA GAA GAA GTG AAC GAA GAA 2160 
Thr Ala Ser Arg Ile Tyr Asn Val Lys Pro Glu Glu Val Asn Glu Glu 
705 710 715 720 

ATG CGA CGG GTT GGA AAG ATG GTG AAC TTC TCT ATA ATA TAC GGT GTC 2208 
Met Arg Arg Val Gly Lys Met Val Asn Phe Ser Ile Ile Tyr Gly Val 
725 730 735 

ACA CCG TAC GGT CTT TCT GTG AGA CTT GGA ATA CCG GTT AAA GAA GCA 2256 
Thr Pro Tyr Gly Leu Ser Val Arg Leu Gly Ile Pro Val Lys Glu Ala 
740 745 750 

GAA AAG ATG ATT ATC AGC TAT TTC ACA CTG TAT CCA AAG GTG CGA AGC 2304 
Glu Lys Met Ile Ile Ser Tyr Phe Thr Leu Tyr Pro Lys Val Arg Ser 
755 760 765 

TAC ATC CAG CAG GTT GTT GCA GAG GCA AAA GAG AAG GGC TAC GTC AGG 2352 
Tyr Ile Gln Gln Val Val Ala Glu Ala Lys Glu Lys Gly Tyr Val Arg 
770 775 780 

ACT CTC TTT GGA AGA AAA AGA GAT ATT CCC CAG CTC ATG GCA AGG GAC 2400 
Thr Leu Phe Gly Arg Lys Arg Asp Ile Pro Gln Leu Met Ala Arg Asp 
785 790 795 800 

AAG AAC ACC CAG TCC GAA GGC GAA AGA ATC GCA ATA AAC ACC CCC ATT 2448 
Lys Asn Thr Gln Ser Glu Gly Glu Arg Ile Ala Ile Asn Thr Pro Ile 
805 810 815 

CAG GGA ACG GCG GCA GAT ATA ATA AAA TTG GCT ATG ATA GAT ATA GAC 2496 
Gln Gly Thr Ala Ala Asp Ile Ile Lys Leu Ala Met Ile Asp Ile Asp 
820 825 830 

GAG GAG CTG AGA AAA AGA AAC ATG AAA TCC AGA ATG ATC ATT CAG GTT 2544 
Glu Glu Leu Arg Lys Arg Asn Met Lys Ser Arg Met Ile Ile Gln Val 
835 840 845 

CAT GAC GAA CTG GTC TTC GAG GTT CCC GAT GAG GAA AAA GAA GAA CTA 2592 
His Asp Glu Leu Val Phe Glu Val Pro Asp Glu Glu Lys Glu Glu Leu 
850 855 860 

GTT GAT CTG GTG AAG AAC AAA ATG ACA AAT GTG GTG AAA CTC TCT GTG 2640 
Val Asp Leu Val Lys Asn Lys Met Thr Asn Val Val Lys Leu Ser Val 
865 870 875 880 

CCT CTT GAG GTT GAC ATA AGC ATC GGA AAA AGC TGG TCT TGA 2682 
Pro Leu Glu Val Asp Ile Ser Ile Gly Lys Ser Trp Ser 
885 890 

(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 893 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 

Met Ala Arg Leu Phe Leu Phe Asp Gly Thr Ala Leu Ala Tyr Arg Ala 
1 5 10 15 

Tyr Tyr Ala Leu Asp Arg Ser Leu Ser Thr Ser Thr Gly Ile Pro Thr 
20 25 30 

Asn Ala Val Tyr Gly Val Ala Arg Met Leu Val Lys Phe Ile Lys Glu 
35 40 45 

His Ile Ile Pro Glu Lys Asp Tyr Ala Ala Val Ala Phe Asp Lys Lys 
50 55 60 

Ala Ala Thr Phe Arg His Lys Leu Leu Glu Ala Tyr Lys Ala Gln Arg 
65 70 75 80 

Pro Lys Thr Pro Asp Leu Leu Val Gln Gln Leu Pro Tyr Ile Lys Arg 
85 90 95 

Leu Ile Glu Ala Leu Gly Phe Lys Val Leu Glu Leu Glu Gly Tyr Glu 
100 105 110 

Ala Asp Asp Ile Ile Ala Thr Leu Ala Val Lys Gly Cys Thr Phe Phe 
115 120 125 

Asp Glu Ile Phe Ile Ile Thr Gly Asp Lys Asp Met Leu Gln Leu Val 
130 135 140 

Asn Glu Lys Ile Lys Val Trp Arg Ile Val Lys Gly Ile Ser Asp Leu 
145 150 155 160 

Glu Leu Tyr Asp Ser Lys Lys Val Lys Glu Arg Tyr Gly Val Glu Pro 
165 170 175 

His Gln Ile Pro Asp Leu Leu Ala Leu Thr Gly Asp Glu Ile Asp Asn 
180 185 190 

Ile Pro Gly Val Thr Gly Ile Gly Glu Lys Thr Ala Val Gln Leu Leu 
195 200 205 

Gly Lys Tyr Arg Asn Leu Glu Asp Ile Leu Glu His Ala Arg Glu Leu 
210 215 220 

Pro Gln Arg Val Arg Lys Ala Leu Leu Arg Asp Arg Glu Val Ala Ile 
225 230 235 240 

Leu Ser Lys Lys Leu Ala Thr Leu Val Thr Asn Ala Pro Val Glu Val 
245 250 255 

Asp Trp Glu Glu Met Lys Tyr Arg Gly Tyr Asp Lys Arg Lys Leu Leu 
260 265 270 

Pro Ile Leu Lys Glu Leu Glu Phe Ala Ser Ile Met Lys Glu Leu Gln 
275 280 285 

Leu Tyr Glu Glu Ala Glu Pro Thr Gly Tyr Glu Ile Val Lys Asp His 
290 295 300 

Lys Thr Phe Glu Asp Leu Ile Glu Lys Leu Lys Glu Val Pro Ser Phe 
305 310 315 320 

Ala Leu Asp Leu Glu Thr Ser Ser Leu Asp Pro Phe Asn Cys Glu Ile 
325 330 335 

Val Gly Ile Ser Val Ser Phe Lys Pro Lys Thr Ala Tyr Tyr Ile Pro 
340 345 350 

Leu His His Arg Asn Ala Gln Asn Leu Asp Glu Thr Leu Val Leu Ser 
355 360 365 

Lys Leu Lys Glu Ile Leu Glu Asp Pro Ser Ser Lys Ile Val Gly Gln 
370 375 380 



Asn Leu Lys Tyr Asp Tyr Lys Val Leu Met Val Lys Gly He Ser Pro 
385 390 395 400 
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Val Tyr Pro His 
Asn Glu 
Tyr Lys 



Lys Lys 
420 



Phe Gly 
450 

Ser Cys 
465 



Met Thr 
435 

Phe Ser 
Glu Asp 



Phe Asp Thr Met He Ala Ala Tyr Leu Leu Glu Pro 
405 410 415 

Phe Asn Leu Glu Asp Leu Ser Leu Lys Phe Leu Gly 
425 430 

Ser Tyr Gin Glu Leu Met Ser Phe Ser Ser Pro Leu 
440 445 

Phe Ala Asp Val Pro Val Asp Lys Ala Ala Asn Tyr 
455 460 

Ala Asp He Thr Tyr Arg Leu Tyr Lys He Leu Ser 
470 475 480 



Met Lys Leu His Glu Ala Glu Leu Glu Asn Val Phe Tyr Arg He Glu 
485 490 495 



Met Pro 



Val Asp 



Leu Val 
500 

Thr Glu 

515 



Asn Val Leu Ala Arg Met Glu Leu Asn Gly Val Tyr 
505 510 

Phe Leu Lys Lys Leu Ser Glu Glu Tyr Gly Lys Lys 

520 525 



Leu Glu Glu Leu Ala Glu Lys He Tyr Gin He Ala Gly Glu Pro Phe 
530 535 540 



Asn He 
545 



Gly He Lys Pro 
Arg He 
Leu He 



Asn Ser Pro Lys Gin Val Ser Lys He Leu Phe Glu Lys Leu 
S50 555 560 

Arg Gly Lys Thr Thr Lys Thr Gly Ala Tyr Ser Thr 
565 570 575 

Leu Glu Glu He Ala Asn Glu His Glu He Val Pro 
585 590 

Leu Glu Tyr Arg Lys He Gin Lys Leu Lys Ser Thr Tyr He 
S95 600 60S 



Glu Val 
580 



Asp Thr 

610 

Ser Phe 
625 

Pro Asn 
Arg Lys 
Asp Tyr 



Glu Asn 
690 



Leu Pro Lys Leu Val Asn Pro Lys Thr Gly Arg He His Ala 
615 620 

His Gin Thr Gly Thr Ala Thr Gly Arg Leu Ser Ser Ser Asp 
630 635 640 

Leu Gin Asn Leu Pro Thr Lys Ser Glu Glu Gly Lys Glu He 
645 650 655 

Ala He Val Pro Gin Asp Pro Asp Trp Trp lie Val Ser Ala 
660 665 670 

Ser Gin He Glu Leu Arg He Leu Ala His Leu Ser Gly Asp 
675 680 685 

Leu Val Lys Ala Phe Glu Glu Gly He Asp Val His Thr Leu 
695 700 



Thr Ala Ser Arg 
705 

Met Arg Arg Val 



He Tyr Asn Val Lys Pro Glu Glu Val Asn Glu Glu 

710 715 720 

Gly Lys Met Val Asn Phe Ser He lie Tyr Gly Val 

725 730 735 
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Thr Pro Tyr Gly Leu Ser Val Arg Leu Gly lie Pro Val Lys Glu Ala 
740 745 750 

Glu Lys Met Ile Ile Ser Tyr Phe Thr Leu Tyr Pro Lys Val Arg Ser
755 760 765

Tyr Ile Gln Gln Val Val Ala Glu Ala Lys Glu Lys Gly Tyr Val Arg
770 775 780

Thr Leu Phe Gly Arg Lys Arg Asp Ile Pro Gln Leu Met Ala Arg Asp
785 790 795 800

Lys Asn Thr Gln Ser Glu Gly Glu Arg Ile Ala Ile Asn Thr Pro Ile
805 810 815

Gln Gly Thr Ala Ala Asp Ile Ile Lys Leu Ala Met Ile Asp Ile Asp
820 825 830

Glu Glu Leu Arg Lys Arg Asn Met Lys Ser Arg Met Ile Ile Gln Val
835 840 845

His Asp Glu Leu Val Phe Glu Val Pro Asp Glu Glu Lys Glu Glu Leu
850 855 860

Val Asp Leu Val Lys Asn Lys Met Thr Asn Val Val Lys Leu Ser Val
865 870 875 880

Pro Leu Glu Val Asp Ile Ser Ile Gly Lys Ser Trp Ser
885 890

(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 44 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

CATGGCGAGA CTATTTCTCT TTGATGGCAC AGCCCTGGCC TACA 44

(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 37 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

AGGCCAGGGC TGTGCCATCA AAGAGAAATA GTCTCGC 37

(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 39 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

AGGCCAGGGC TGTGCCATCA AAGAGAAATA GTCTCGCCA 39

(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 44 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

TATGGCGAGA CTATTTCTCT TTGATGGCAC AGCCCTGGCC TACA 44
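
The oligonucleotides of SEQ ID NOS: 3-6 can be compared strand against strand directly from the listing. The following Python sketch is illustrative only and is not part of the original disclosure; the helper names `reverse_complement` and `annealing_offset` are assumptions introduced here. It checks whether SEQ ID NO: 4 occurs within the reverse complement of SEQ ID NO: 3, and whether SEQ ID NO: 5 occurs within the reverse complement of SEQ ID NO: 6.

```python
# Illustrative check only; sequences are copied from SEQ ID NOS: 3-6 above.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def annealing_offset(long_strand: str, short_strand: str) -> int:
    """Index of short_strand within the reverse complement of long_strand.

    Returns -1 when the short strand is not found.
    """
    return reverse_complement(long_strand).find(short_strand)


SEQ_ID_3 = "CATGGCGAGACTATTTCTCTTTGATGGCACAGCCCTGGCCTACA"
SEQ_ID_4 = "AGGCCAGGGCTGTGCCATCAAAGAGAAATAGTCTCGC"
SEQ_ID_5 = "AGGCCAGGGCTGTGCCATCAAAGAGAAATAGTCTCGCCA"
SEQ_ID_6 = "TATGGCGAGACTATTTCTCTTTGATGGCACAGCCCTGGCCTACA"

if __name__ == "__main__":
    for name, seq in (("3", SEQ_ID_3), ("4", SEQ_ID_4),
                      ("5", SEQ_ID_5), ("6", SEQ_ID_6)):
        print(f"SEQ ID NO: {name}: {len(seq)} bases")
    print("offset of NO: 4 in revcomp(NO: 3):", annealing_offset(SEQ_ID_3, SEQ_ID_4))
    print("offset of NO: 5 in revcomp(NO: 6):", annealing_offset(SEQ_ID_6, SEQ_ID_5))
```

A non-negative offset means the shorter oligonucleotide is fully complementary to an internal stretch of the longer one, so the remaining bases at each end of the longer strand would be left unpaired if the two were annealed.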
(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1833 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..1833

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

ATG AAG GAA CTT CAA CTG TAC GAA GAA GCA GAA CCC ACC GGA TAC GAA 48
Met Lys Glu Leu Gln Leu Tyr Glu Glu Ala Glu Pro Thr Gly Tyr Glu
1 5 10 15

ATC GTG AAG GAT CAT AAG ACC TTC GAA GAT CTC ATC GAA AAG CTG AAG 96
Ile Val Lys Asp His Lys Thr Phe Glu Asp Leu Ile Glu Lys Leu Lys
20 25 30

GAG GTT CCA TCT TTT GCC CTG GAC CTT GAA ACG TCC TCC CTT GAC CCG 144
Glu Val Pro Ser Phe Ala Leu Asp Leu Glu Thr Ser Ser Leu Asp Pro
35 40 45

TTC AAC TGT GAG ATA GTC GGC ATC TCC GTG TCG TTC AAA CCG AAA ACA 192
Phe Asn Cys Glu Ile Val Gly Ile Ser Val Ser Phe Lys Pro Lys Thr
50 55 60

GCT TAT TAC ATT CCA CTT CAT CAC AGA AAC GCC CAG AAT CTT GAT GAA 240
Ala Tyr Tyr Ile Pro Leu His His Arg Asn Ala Gln Asn Leu Asp Glu
65 70 75 80

ACA CTG GTG CTG TCG AAG TTG AAA GAG ATC CTC GAA GAC CCG TCT TCG 288
Thr Leu Val Leu Ser Lys Leu Lys Glu Ile Leu Glu Asp Pro Ser Ser
85 90 95

AAG ATT GTG GGT CAG AAC CTG AAG TAC GAC TAC AAG GTT CTT ATG GTA 336
Lys Ile Val Gly Gln Asn Leu Lys Tyr Asp Tyr Lys Val Leu Met Val
100 105 110

AAG GGT ATA TCG CCA GTT TAT CCG CAT TTT GAC ACG ATG ATA GCT GCA 384
Lys Gly Ile Ser Pro Val Tyr Pro His Phe Asp Thr Met Ile Ala Ala
115 120 125

TAT TTG CTG GAG CCA AAC GAG AAA AAA TTC AAT CTC GAA GAT CTG TCT 432
Tyr Leu Leu Glu Pro Asn Glu Lys Lys Phe Asn Leu Glu Asp Leu Ser
130 135 140

TTG AAA TTT CTC GGA TAC AAA ATG ACG TCT TAT CAG GAA CTG ATG TCG 480
Leu Lys Phe Leu Gly Tyr Lys Met Thr Ser Tyr Gln Glu Leu Met Ser
145 150 155 160

TTT TCC TCA CCA CTT TTT GGT TTC AGC TTT GCG GAT GTT CCG GTA GAC 528
Phe Ser Ser Pro Leu Phe Gly Phe Ser Phe Ala Asp Val Pro Val Asp
165 170 175

AAG GCT GCG AAC TAC TCC TGC GAG GAT GCA GAC ATC ACT TAT AGG CTC 576
Lys Ala Ala Asn Tyr Ser Cys Glu Asp Ala Asp Ile Thr Tyr Arg Leu
180 185 190

TAC AAG ATA CTC AGC ATG AAG CTC CAT GAA GCG GAA CTT GAG AAC GTC 624
Tyr Lys Ile Leu Ser Met Lys Leu His Glu Ala Glu Leu Glu Asn Val
195 200 205

TTC TAC AGG ATA GAG ATG CCG CTT GTG AAC GTT CTT GCA CGC ATG GAA 672
Phe Tyr Arg Ile Glu Met Pro Leu Val Asn Val Leu Ala Arg Met Glu
210 215 220

TTG AAC GGG GTG TAT GTG GAC ACA GAA TTC CTG AAA AAG CTC TCG GAG 720
Leu Asn Gly Val Tyr Val Asp Thr Glu Phe Leu Lys Lys Leu Ser Glu
225 230 235 240

GAG TAC GGC AAA AAG CTC GAG GAA CTG GCC GAA AAA ATC TAC CAG ATA 768
Glu Tyr Gly Lys Lys Leu Glu Glu Leu Ala Glu Lys Ile Tyr Gln Ile
245 250 255

GCA GGA GAG CCC TTC AAC ATC AAT TCT CCA AAA CAG GTT TCA AAG ATC 816
Ala Gly Glu Pro Phe Asn Ile Asn Ser Pro Lys Gln Val Ser Lys Ile
260 265 270

CTT TTT GAG AAG CTG GGA ATA AAA CCC CGT GGA AAA ACG ACA AAA ACA 864
Leu Phe Glu Lys Leu Gly Ile Lys Pro Arg Gly Lys Thr Thr Lys Thr
275 280 285

GGA GCG TAC TCT ACC AGG ATA GAG GTG TTG GAA GAG ATA GCG AAT GAG 912
Gly Ala Tyr Ser Thr Arg Ile Glu Val Leu Glu Glu Ile Ala Asn Glu
290 295 300

CAC GAG ATA GTA CCC CTC ATT CTC GAG TAC AGA AAG ATC CAG AAA CTG 960
His Glu Ile Val Pro Leu Ile Leu Glu Tyr Arg Lys Ile Gln Lys Leu
305 310 315 320

AAA TCG ACC TAC ATA GAC ACC CTT CCG AAA CTT GTG AAC CCG AAA ACC 1008
Lys Ser Thr Tyr Ile Asp Thr Leu Pro Lys Leu Val Asn Pro Lys Thr
325 330 335

GGA AGA ATT CAT GCA TCT TTC CAC CAG ACG GGT ACC GCC ACT GGC AGG 1056
Gly Arg Ile His Ala Ser Phe His Gln Thr Gly Thr Ala Thr Gly Arg
340 345 350

TTG AGT AGC AGT GAT CCA AAT CTT CAG AAT CTT CCG ACA AAG AGC GAA 1104
Leu Ser Ser Ser Asp Pro Asn Leu Gln Asn Leu Pro Thr Lys Ser Glu
355 360 365

GAG GGA AAA GAA ATT AGA AAA GCG ATT GTG CCC CAG GAT CCA GAC TGG 1152
Glu Gly Lys Glu Ile Arg Lys Ala Ile Val Pro Gln Asp Pro Asp Trp
370 375 380

TGG ATC GTC AGT GCG GAT TAT TCC CAA ATA GAA CTC AGA ATC CTC GCT 1200
Trp Ile Val Ser Ala Asp Tyr Ser Gln Ile Glu Leu Arg Ile Leu Ala
385 390 395 400

CAT CTC AGT GGT GAT GAG AAC CTT GTG AAG GCC TTC GAG GAG GGC ATC 1248
His Leu Ser Gly Asp Glu Asn Leu Val Lys Ala Phe Glu Glu Gly Ile
405 410 415

GAT GTG CAC ACC TTG ACT GCC TCC AGG ATC TAC AAC GTA AAG CCA GAA 1296
Asp Val His Thr Leu Thr Ala Ser Arg Ile Tyr Asn Val Lys Pro Glu
420 425 430

GAA GTG AAC GAA GAA ATG CGA CGG GTT GGA AAG ATG GTG AAC TTC TCT 1344
Glu Val Asn Glu Glu Met Arg Arg Val Gly Lys Met Val Asn Phe Ser
435 440 445

ATA ATA TAC GGT GTC ACA CCG TAC GGT CTT TCT GTG AGA CTT GGA ATA 1392
Ile Ile Tyr Gly Val Thr Pro Tyr Gly Leu Ser Val Arg Leu Gly Ile
450 455 460

CCG GTT AAA GAA GCA GAA AAG ATG ATT ATC AGC TAT TTC ACA CTG TAT 1440
Pro Val Lys Glu Ala Glu Lys Met Ile Ile Ser Tyr Phe Thr Leu Tyr
465 470 475 480

CCA AAG GTG CGA AGC TAC ATC CAG CAG GTT GTT GCA GAG GCA AAA GAG 1488
Pro Lys Val Arg Ser Tyr Ile Gln Gln Val Val Ala Glu Ala Lys Glu
485 490 495

AAG GGC TAC GTC AGG ACT CTC TTT GGA AGA AAA AGA GAT ATT CCC CAG 1536
Lys Gly Tyr Val Arg Thr Leu Phe Gly Arg Lys Arg Asp Ile Pro Gln
500 505 510

CTC ATG GCA AGG GAC AAG AAC ACC CAG TCC GAA GGC GAA AGA ATC GCA 1584
Leu Met Ala Arg Asp Lys Asn Thr Gln Ser Glu Gly Glu Arg Ile Ala
515 520 525

ATA AAC ACC CCC ATT CAG GGA ACG GCG GCA GAT ATA ATA AAA TTG GCT 1632
Ile Asn Thr Pro Ile Gln Gly Thr Ala Ala Asp Ile Ile Lys Leu Ala
530 535 540

ATG ATA GAT ATA GAC GAG GAG CTG AGA AAA AGA AAC ATG AAA TCC AGA 1680
Met Ile Asp Ile Asp Glu Glu Leu Arg Lys Arg Asn Met Lys Ser Arg
545 550 555 560

ATG ATC ATT CAG GTT CAT GAC GAA CTG GTC TTC GAG GTT CCC GAT GAG 1728
Met Ile Ile Gln Val His Asp Glu Leu Val Phe Glu Val Pro Asp Glu
565 570 575

GAA AAA GAA GAA CTA GTT GAT CTG GTG AAG AAC AAA ATG ACA AAT GTG 1776
Glu Lys Glu Glu Leu Val Asp Leu Val Lys Asn Lys Met Thr Asn Val
580 585 590

GTG AAA CTC TCT GTG CCT CTT GAG GTT GAC ATA AGC ATC GGA AAA AGC 1824
Val Lys Leu Ser Val Pro Leu Glu Val Asp Ile Ser Ile Gly Lys Ser
595 600 605

TGG TCT TG 1833
Trp Ser
610
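
The amino acid line printed beneath each codon line follows from the standard genetic code. As a hedged illustration that is not part of the original listing, the Python sketch below translates the first codon row of SEQ ID NO: 7 into three-letter residues in the same layout. The function name `translate` and the compact table construction are assumptions chosen for brevity.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
ONE_LETTER = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), ONE_LETTER)
}

THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "End",  # stop codon
}


def translate(dna: str) -> list[str]:
    """Translate DNA (spaces allowed) into a list of three-letter residues."""
    dna = "".join(dna.split()).upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    codons = [dna[i:i + 3] for i in range(0, usable, 3)]
    return [THREE_LETTER[CODON_TABLE[codon]] for codon in codons]


# First codon row of SEQ ID NO: 7 (nucleotides 1-48), copied from the listing.
FIRST_ROW = "ATG AAG GAA CTT CAA CTG TAC GAA GAA GCA GAA CCC ACC GGA TAC GAA"

if __name__ == "__main__":
    print(" ".join(translate(FIRST_ROW)))
```

The same function can be applied row by row to cross-check any codon line of the listing against the residue line printed beneath it.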

(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 610 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 

Met Lys Glu Leu Gln Leu Tyr Glu Glu Ala Glu Pro Thr Gly Tyr Glu
1 5 10 15

Ile Val Lys Asp His Lys Thr Phe Glu Asp Leu Ile Glu Lys Leu Lys
20 25 30

Glu Val Pro Ser Phe Ala Leu Asp Leu Glu Thr Ser Ser Leu Asp Pro
35 40 45

Phe Asn Cys Glu Ile Val Gly Ile Ser Val Ser Phe Lys Pro Lys Thr
50 55 60

Ala Tyr Tyr Ile Pro Leu His His Arg Asn Ala Gln Asn Leu Asp Glu
65 70 75 80

Thr Leu Val Leu Ser Lys Leu Lys Glu Ile Leu Glu Asp Pro Ser Ser
85 90 95

Lys Ile Val Gly Gln Asn Leu Lys Tyr Asp Tyr Lys Val Leu Met Val
100 105 110

Lys Gly Ile Ser Pro Val Tyr Pro His Phe Asp Thr Met Ile Ala Ala
115 120 125

Tyr Leu Leu Glu Pro Asn Glu Lys Lys Phe Asn Leu Glu Asp Leu Ser
130 135 140

Leu Lys Phe Leu Gly Tyr Lys Met Thr Ser Tyr Gln Glu Leu Met Ser
145 150 155 160

Phe Ser Ser Pro Leu Phe Gly Phe Ser Phe Ala Asp Val Pro Val Asp
165 170 175

Lys Ala Ala Asn Tyr Ser Cys Glu Asp Ala Asp Ile Thr Tyr Arg Leu
180 185 190

Tyr Lys Ile Leu Ser Met Lys Leu His Glu Ala Glu Leu Glu Asn Val
195 200 205

Phe Tyr Arg Ile Glu Met Pro Leu Val Asn Val Leu Ala Arg Met Glu
210 215 220

Leu Asn Gly Val Tyr Val Asp Thr Glu Phe Leu Lys Lys Leu Ser Glu
225 230 235 240

Glu Tyr Gly Lys Lys Leu Glu Glu Leu Ala Glu Lys Ile Tyr Gln Ile
245 250 255

Ala Gly Glu Pro Phe Asn Ile Asn Ser Pro Lys Gln Val Ser Lys Ile
260 265 270

Leu Phe Glu Lys Leu Gly Ile Lys Pro Arg Gly Lys Thr Thr Lys Thr
275 280 285

Gly Ala Tyr Ser Thr Arg Ile Glu Val Leu Glu Glu Ile Ala Asn Glu
290 295 300

His Glu Ile Val Pro Leu Ile Leu Glu Tyr Arg Lys Ile Gln Lys Leu
305 310 315 320

Lys Ser Thr Tyr Ile Asp Thr Leu Pro Lys Leu Val Asn Pro Lys Thr
325 330 335

Gly Arg Ile His Ala Ser Phe His Gln Thr Gly Thr Ala Thr Gly Arg
340 345 350

Leu Ser Ser Ser Asp Pro Asn Leu Gln Asn Leu Pro Thr Lys Ser Glu
355 360 365

Glu Gly Lys Glu Ile Arg Lys Ala Ile Val Pro Gln Asp Pro Asp Trp
370 375 380

Trp Ile Val Ser Ala Asp Tyr Ser Gln Ile Glu Leu Arg Ile Leu Ala
385 390 395 400

His Leu Ser Gly Asp Glu Asn Leu Val Lys Ala Phe Glu Glu Gly Ile
405 410 415

Asp Val His Thr Leu Thr Ala Ser Arg Ile Tyr Asn Val Lys Pro Glu
420 425 430

Glu Val Asn Glu Glu Met Arg Arg Val Gly Lys Met Val Asn Phe Ser
435 440 445

Ile Ile Tyr Gly Val Thr Pro Tyr Gly Leu Ser Val Arg Leu Gly Ile
450 455 460

Pro Val Lys Glu Ala Glu Lys Met Ile Ile Ser Tyr Phe Thr Leu Tyr
465 470 475 480

Pro Lys Val Arg Ser Tyr Ile Gln Gln Val Val Ala Glu Ala Lys Glu
485 490 495

Lys Gly Tyr Val Arg Thr Leu Phe Gly Arg Lys Arg Asp Ile Pro Gln
500 505 510

Leu Met Ala Arg Asp Lys Asn Thr Gln Ser Glu Gly Glu Arg Ile Ala
515 520 525

Ile Asn Thr Pro Ile Gln Gly Thr Ala Ala Asp Ile Ile Lys Leu Ala
530 535 540

Met Ile Asp Ile Asp Glu Glu Leu Arg Lys Arg Asn Met Lys Ser Arg
545 550 555 560

Met Ile Ile Gln Val His Asp Glu Leu Val Phe Glu Val Pro Asp Glu
565 570 575

Glu Lys Glu Glu Leu Val Asp Leu Val Lys Asn Lys Met Thr Asn Val
580 585 590

Val Lys Leu Ser Val Pro Leu Glu Val Asp Ile Ser Ile Gly Lys Ser
595 600 605

Trp Ser
610

(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

ATCGAAAAGC TGACCATGGT TCCATCTTTT G 31
(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1737 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..1737

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

ATG GTT CCA TCT TTT GCC CTG GAC CTT GAA ACG TCC TCC CTT GAC CCG 48
Met Val Pro Ser Phe Ala Leu Asp Leu Glu Thr Ser Ser Leu Asp Pro
1 5 10 15

TTC AAC TGT GAG ATA GTC GGC ATC TCC GTG TCG TTC AAA CCG AAA ACA 96
Phe Asn Cys Glu Ile Val Gly Ile Ser Val Ser Phe Lys Pro Lys Thr
20 25 30

GCT TAT TAC ATT CCA CTT CAT CAC AGA AAC GCC CAG AAT CTT GAT GAA 144
Ala Tyr Tyr Ile Pro Leu His His Arg Asn Ala Gln Asn Leu Asp Glu
35 40 45

ACA CTG GTG CTG TCG AAG TTG AAA GAG ATC CTC GAA GAC CCG TCT TCG 192
Thr Leu Val Leu Ser Lys Leu Lys Glu Ile Leu Glu Asp Pro Ser Ser
50 55 60

AAG ATT GTG GGT CAG AAC CTG AAG TAC GAC TAC AAG GTT CTT ATG GTA 240
Lys Ile Val Gly Gln Asn Leu Lys Tyr Asp Tyr Lys Val Leu Met Val
65 70 75 80

AAG GGT ATA TCG CCA GTT TAT CCG CAT TTT GAC ACG ATG ATA GCT GCA 288
Lys Gly Ile Ser Pro Val Tyr Pro His Phe Asp Thr Met Ile Ala Ala
85 90 95

TAT TTG CTG GAG CCA AAC GAG AAA AAA TTC AAT CTC GAA GAT CTG TCT 336
Tyr Leu Leu Glu Pro Asn Glu Lys Lys Phe Asn Leu Glu Asp Leu Ser
100 105 110

TTG AAA TTT CTC GGA TAC AAA ATG ACG TCT TAT CAG GAA CTG ATG TCG 384
Leu Lys Phe Leu Gly Tyr Lys Met Thr Ser Tyr Gln Glu Leu Met Ser
115 120 125

TTT TCC TCA CCA CTT TTT GGT TTC AGC TTT GCG GAT GTT CCG GTA GAC 432
Phe Ser Ser Pro Leu Phe Gly Phe Ser Phe Ala Asp Val Pro Val Asp
130 135 140

AAG GCT GCG AAC TAC TCC TGC GAG GAT GCA GAC ATC ACT TAT AGG CTC 480
Lys Ala Ala Asn Tyr Ser Cys Glu Asp Ala Asp Ile Thr Tyr Arg Leu
145 150 155 160

TAC AAG ATA CTC AGC ATG AAG CTC CAT GAA GCG GAA CTT GAG AAC GTC 528
Tyr Lys Ile Leu Ser Met Lys Leu His Glu Ala Glu Leu Glu Asn Val
165 170 175

TTC TAC AGG ATA GAG ATG CCG CTT GTG AAC GTT CTT GCA CGC ATG GAA 576
Phe Tyr Arg Ile Glu Met Pro Leu Val Asn Val Leu Ala Arg Met Glu
180 185 190

TTG AAC GGG GTG TAT GTG GAC ACA GAA TTC CTG AAA AAG CTC TCG GAG 624
Leu Asn Gly Val Tyr Val Asp Thr Glu Phe Leu Lys Lys Leu Ser Glu
195 200 205

GAG TAC GGC AAA AAG CTC GAG GAA CTG GCC GAA AAA ATC TAC CAG ATA 672
Glu Tyr Gly Lys Lys Leu Glu Glu Leu Ala Glu Lys Ile Tyr Gln Ile
210 215 220

GCA GGA GAG CCC TTC AAC ATC AAT TCT CCA AAA CAG GTT TCA AAG ATC 720
Ala Gly Glu Pro Phe Asn Ile Asn Ser Pro Lys Gln Val Ser Lys Ile
225 230 235 240

CTT TTT GAG AAG CTG GGA ATA AAA CCC CGT GGA AAA ACG ACA AAA ACA 768
Leu Phe Glu Lys Leu Gly Ile Lys Pro Arg Gly Lys Thr Thr Lys Thr
245 250 255

GGA GCG TAC TCT ACC AGG ATA GAG GTG TTG GAA GAG ATA GCG AAT GAG 816
Gly Ala Tyr Ser Thr Arg Ile Glu Val Leu Glu Glu Ile Ala Asn Glu
260 265 270

CAC GAG ATA GTA CCC CTC ATT CTC GAG TAC AGA AAG ATC CAG AAA CTG 864
His Glu Ile Val Pro Leu Ile Leu Glu Tyr Arg Lys Ile Gln Lys Leu
275 280 285

AAA TCG ACC TAC ATA GAC ACC CTT CCG AAA CTT GTG AAC CCG AAA ACC 912
Lys Ser Thr Tyr Ile Asp Thr Leu Pro Lys Leu Val Asn Pro Lys Thr
290 295 300

GGA AGA ATT CAT GCA TCT TTC CAC CAG ACG GGT ACC GCC ACT GGC AGG 960
Gly Arg Ile His Ala Ser Phe His Gln Thr Gly Thr Ala Thr Gly Arg
305 310 315 320

TTG AGT AGC AGT GAT CCA AAT CTT CAG AAT CTT CCG ACA AAG AGC GAA 1008
Leu Ser Ser Ser Asp Pro Asn Leu Gln Asn Leu Pro Thr Lys Ser Glu
325 330 335

GAG GGA AAA GAA ATT AGA AAA GCG ATT GTG CCC CAG GAT CCA GAC TGG 1056
Glu Gly Lys Glu Ile Arg Lys Ala Ile Val Pro Gln Asp Pro Asp Trp
340 345 350

TGG ATC GTC AGT GCG GAT TAT TCC CAA ATA GAA CTC AGA ATC CTC GCT 1104
Trp Ile Val Ser Ala Asp Tyr Ser Gln Ile Glu Leu Arg Ile Leu Ala
355 360 365

CAT CTC AGT GGT GAT GAG AAC CTT GTG AAG GCC TTC GAG GAG GGC ATC 1152
His Leu Ser Gly Asp Glu Asn Leu Val Lys Ala Phe Glu Glu Gly Ile
370 375 380

GAT GTG CAC ACC TTG ACT GCC TCC AGG ATC TAC AAC GTA AAG CCA GAA 1200
Asp Val His Thr Leu Thr Ala Ser Arg Ile Tyr Asn Val Lys Pro Glu
385 390 395 400

GAA GTG AAC GAA GAA ATG CGA CGG GTT GGA AAG ATG GTG AAC TTC TCT 1248
Glu Val Asn Glu Glu Met Arg Arg Val Gly Lys Met Val Asn Phe Ser
405 410 415

ATA ATA TAC GGT GTC ACA CCG TAC GGT CTT TCT GTG AGA CTT GGA ATA 1296
Ile Ile Tyr Gly Val Thr Pro Tyr Gly Leu Ser Val Arg Leu Gly Ile
420 425 430

CCG GTT AAA GAA GCA GAA AAG ATG ATT ATC AGC TAT TTC ACA CTG TAT 1344
Pro Val Lys Glu Ala Glu Lys Met Ile Ile Ser Tyr Phe Thr Leu Tyr
435 440 445

CCA AAG GTG CGA AGC TAC ATC CAG CAG GTT GTT GCA GAG GCA AAA GAG 1392
Pro Lys Val Arg Ser Tyr Ile Gln Gln Val Val Ala Glu Ala Lys Glu
450 455 460

AAG GGC TAC GTC AGG ACT CTC TTT GGA AGA AAA AGA GAT ATT CCC CAG 1440
Lys Gly Tyr Val Arg Thr Leu Phe Gly Arg Lys Arg Asp Ile Pro Gln
465 470 475 480

CTC ATG GCA AGG GAC AAG AAC ACC CAG TCC GAA GGC GAA AGA ATC GCA 1488
Leu Met Ala Arg Asp Lys Asn Thr Gln Ser Glu Gly Glu Arg Ile Ala
485 490 495

ATA AAC ACC CCC ATT CAG GGA ACG GCG GCA GAT ATA ATA AAA TTG GCT 1536
Ile Asn Thr Pro Ile Gln Gly Thr Ala Ala Asp Ile Ile Lys Leu Ala
500 505 510

ATG ATA GAT ATA GAC GAG GAG CTG AGA AAA AGA AAC ATG AAA TCC AGA 1584
Met Ile Asp Ile Asp Glu Glu Leu Arg Lys Arg Asn Met Lys Ser Arg
515 520 525

ATG ATC ATT CAG GTT CAT GAC GAA CTG GTC TTC GAG GTT CCC GAT GAG 1632
Met Ile Ile Gln Val His Asp Glu Leu Val Phe Glu Val Pro Asp Glu
530 535 540

GAA AAA GAA GAA CTA GTT GAT CTG GTG AAG AAC AAA ATG ACA AAT GTG 1680
Glu Lys Glu Glu Leu Val Asp Leu Val Lys Asn Lys Met Thr Asn Val
545 550 555 560

GTG AAA CTC TCT GTG CCT CTT GAG GTT GAC ATA AGC ATC GGA AAA AGC 1728
Val Lys Leu Ser Val Pro Leu Glu Val Asp Ile Ser Ile Gly Lys Ser
565 570 575

TGG TCT TG 1737
Trp Ser

(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 578 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 

Met Val Pro Ser Phe Ala Leu Asp Leu Glu Thr Ser Ser Leu Asp Pro
1 5 10 15

Phe Asn Cys Glu Ile Val Gly Ile Ser Val Ser Phe Lys Pro Lys Thr
20 25 30

Ala Tyr Tyr Ile Pro Leu His His Arg Asn Ala Gln Asn Leu Asp Glu
35 40 45

Thr Leu Val Leu Ser Lys Leu Lys Glu Ile Leu Glu Asp Pro Ser Ser
50 55 60

Lys Ile Val Gly Gln Asn Leu Lys Tyr Asp Tyr Lys Val Leu Met Val
65 70 75 80

Lys Gly Ile Ser Pro Val Tyr Pro His Phe Asp Thr Met Ile Ala Ala
85 90 95

Tyr Leu Leu Glu Pro Asn Glu Lys Lys Phe Asn Leu Glu Asp Leu Ser
100 105 110

Leu Lys Phe Leu Gly Tyr Lys Met Thr Ser Tyr Gln Glu Leu Met Ser
115 120 125

Phe Ser Ser Pro Leu Phe Gly Phe Ser Phe Ala Asp Val Pro Val Asp
130 135 140

Lys Ala Ala Asn Tyr Ser Cys Glu Asp Ala Asp Ile Thr Tyr Arg Leu
145 150 155 160

Tyr Lys Ile Leu Ser Met Lys Leu His Glu Ala Glu Leu Glu Asn Val
165 170 175

Phe Tyr Arg Ile Glu Met Pro Leu Val Asn Val Leu Ala Arg Met Glu
180 185 190

Leu Asn Gly Val Tyr Val Asp Thr Glu Phe Leu Lys Lys Leu Ser Glu
195 200 205

Glu Tyr Gly Lys Lys Leu Glu Glu Leu Ala Glu Lys Ile Tyr Gln Ile
210 215 220

Ala Gly Glu Pro Phe Asn Ile Asn Ser Pro Lys Gln Val Ser Lys Ile
225 230 235 240

Leu Phe Glu Lys Leu Gly Ile Lys Pro Arg Gly Lys Thr Thr Lys Thr
245 250 255

Gly Ala Tyr Ser Thr Arg Ile Glu Val Leu Glu Glu Ile Ala Asn Glu
260 265 270

His Glu Ile Val Pro Leu Ile Leu Glu Tyr Arg Lys Ile Gln Lys Leu
275 280 285

Lys Ser Thr Tyr Ile Asp Thr Leu Pro Lys Leu Val Asn Pro Lys Thr
290 295 300

Gly Arg Ile His Ala Ser Phe His Gln Thr Gly Thr Ala Thr Gly Arg
305 310 315 320

Leu Ser Ser Ser Asp Pro Asn Leu Gln Asn Leu Pro Thr Lys Ser Glu
325 330 335

Glu Gly Lys Glu Ile Arg Lys Ala Ile Val Pro Gln Asp Pro Asp Trp
340 345 350

Trp Ile Val Ser Ala Asp Tyr Ser Gln Ile Glu Leu Arg Ile Leu Ala
355 360 365

His Leu Ser Gly Asp Glu Asn Leu Val Lys Ala Phe Glu Glu Gly Ile
370 375 380

Asp Val His Thr Leu Thr Ala Ser Arg Ile Tyr Asn Val Lys Pro Glu
385 390 395 400

Glu Val Asn Glu Glu Met Arg Arg Val Gly Lys Met Val Asn Phe Ser
405 410 415

Ile Ile Tyr Gly Val Thr Pro Tyr Gly Leu Ser Val Arg Leu Gly Ile
420 425 430

Pro Val Lys Glu Ala Glu Lys Met Ile Ile Ser Tyr Phe Thr Leu Tyr
435 440 445

Pro Lys Val Arg Ser Tyr Ile Gln Gln Val Val Ala Glu Ala Lys Glu
450 455 460

Lys Gly Tyr Val Arg Thr Leu Phe Gly Arg Lys Arg Asp Ile Pro Gln
465 470 475 480

Leu Met Ala Arg Asp Lys Asn Thr Gln Ser Glu Gly Glu Arg Ile Ala
485 490 495

Ile Asn Thr Pro Ile Gln Gly Thr Ala Ala Asp Ile Ile Lys Leu Ala
500 505 510

Met Ile Asp Ile Asp Glu Glu Leu Arg Lys Arg Asn Met Lys Ser Arg
515 520 525

Met Ile Ile Gln Val His Asp Glu Leu Val Phe Glu Val Pro Asp Glu
530 535 540

Glu Lys Glu Glu Leu Val Asp Leu Val Lys Asn Lys Met Thr Asn Val
545 550 555 560

Val Lys Leu Ser Val Pro Leu Glu Val Asp Ile Ser Ile Gly Lys Ser
565 570 575

Trp Ser

(2) INFORMATION FOR SEQ ID NO: 12:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

TGCCGTACAC CTCCGAGAGC 20

(2) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 24 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:

CTCGTTTGGC TCCAGCAAAT ATGC 24
(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:

TTTGCCCTGG AACTTGAAAC G 21

(2) INFORMATION FOR SEQ ID NO: 15:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1833 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..1833

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:

ATG AAG GAA CTT CAA CTG TAC GAA GAA GCA GAA CCC ACC GGA TAC GAA 48
Met Lys Glu Leu Gln Leu Tyr Glu Glu Ala Glu Pro Thr Gly Tyr Glu
1 5 10 15

ATC GTG AAG GAT CAT AAG ACC TTC GAA GAT CTC ATC GAA AAG CTG AAG 96
Ile Val Lys Asp His Lys Thr Phe Glu Asp Leu Ile Glu Lys Leu Lys
20 25 30

GAG GTT CCA TCT TTT GCC CTG GAA CTT GAA ACG TCC TCC CTT GAC CCG 144
Glu Val Pro Ser Phe Ala Leu Glu Leu Glu Thr Ser Ser Leu Asp Pro
35 40 45

TTC AAC TGT GAG ATA GTC GGC ATC TCC GTG TCG TTC AAA CCG AAA ACA 192
Phe Asn Cys Glu Ile Val Gly Ile Ser Val Ser Phe Lys Pro Lys Thr
50 55 60

GCT TAT TAC ATT CCA CTT CAT CAC AGA AAC GCC CAG AAT CTT GAT GAA 240
Ala Tyr Tyr Ile Pro Leu His His Arg Asn Ala Gln Asn Leu Asp Glu
65 70 75 80

ACA CTG GTG CTG TCG AAG TTG AAA GAG ATC CTC GAA GAC CCG TCT TCG 288
Thr Leu Val Leu Ser Lys Leu Lys Glu Ile Leu Glu Asp Pro Ser Ser
85 90 95

AAG ATT GTG GGT CAG AAC CTG AAG TAC GAC TAC AAG GTT CTT ATG GTA 336
Lys Ile Val Gly Gln Asn Leu Lys Tyr Asp Tyr Lys Val Leu Met Val
100 105 110

AAG GGT ATA TCG CCA GTT TAT CCG CAT TTT GAC ACG ATG ATA GCT GCA 384
Lys Gly Ile Ser Pro Val Tyr Pro His Phe Asp Thr Met Ile Ala Ala
115 120 125

TAT TTG CTG GAG CCA AAC GAG AAA AAA TTC AAT CTC GAA GAT CTG TCT 432
Tyr Leu Leu Glu Pro Asn Glu Lys Lys Phe Asn Leu Glu Asp Leu Ser
130 135 140

TTG AAA TTT CTC GGA TAC AAA ATG ACG TCT TAT CAG GAA CTG ATG TCG 480
Leu Lys Phe Leu Gly Tyr Lys Met Thr Ser Tyr Gln Glu Leu Met Ser
145 150 155 160

TTT TCC TCA CCA CTT TTT GGT TTC AGC TTT GCG GAT GTT CCG GTA GAC 528
Phe Ser Ser Pro Leu Phe Gly Phe Ser Phe Ala Asp Val Pro Val Asp
165 170 175

AAG GCT GCG AAC TAC TCC TGC GAG GAT GCA GAC ATC ACT TAT AGG CTC 576
Lys Ala Ala Asn Tyr Ser Cys Glu Asp Ala Asp Ile Thr Tyr Arg Leu
180 185 190

TAC AAG ATA CTC AGC ATG AAG CTC CAT GAA GCG GAA CTT GAG AAC GTC 624
Tyr Lys Ile Leu Ser Met Lys Leu His Glu Ala Glu Leu Glu Asn Val
195 200 205

TTC TAC AGG ATA GAG ATG CCG CTT GTG AAC GTT CTT GCA CGC ATG GAA 672
Phe Tyr Arg Ile Glu Met Pro Leu Val Asn Val Leu Ala Arg Met Glu
210 215 220

TTG AAC GGG GTG TAT GTG GAC ACA GAA TTC CTG AAA AAG CTC TCG GAG 720
Leu Asn Gly Val Tyr Val Asp Thr Glu Phe Leu Lys Lys Leu Ser Glu
225 230 235 240

GAG TAC GGC AAA AAG CTC GAG GAA CTG GCC GAA AAA ATC TAC CAG ATA 768
Glu Tyr Gly Lys Lys Leu Glu Glu Leu Ala Glu Lys Ile Tyr Gln Ile
245 250 255

GCA GGA GAG CCC TTC AAC ATC AAT TCT CCA AAA CAG GTT TCA AAG ATC 816
Ala Gly Glu Pro Phe Asn Ile Asn Ser Pro Lys Gln Val Ser Lys Ile
260 265 270

CTT TTT GAG AAG CTG GGA ATA AAA CCC CGT GGA AAA ACG ACA AAA ACA 864
Leu Phe Glu Lys Leu Gly Ile Lys Pro Arg Gly Lys Thr Thr Lys Thr
275 280 285

GGA GCG TAC TCT ACC AGG ATA GAG GTG TTG GAA GAG ATA GCG AAT GAG 912
Gly Ala Tyr Ser Thr Arg Ile Glu Val Leu Glu Glu Ile Ala Asn Glu
290 295 300

CAC GAG ATA GTA CCC CTC ATT CTC GAG TAC AGA AAG ATC CAG AAA CTG 960
His Glu Ile Val Pro Leu Ile Leu Glu Tyr Arg Lys Ile Gln Lys Leu
305 310 315 320

AAA TCG ACC TAC ATA GAC ACC CTT CCG AAA CTT GTG AAC CCG AAA ACC 1008
Lys Ser Thr Tyr Ile Asp Thr Leu Pro Lys Leu Val Asn Pro Lys Thr
325 330 335

GGA AGA ATT CAT GCA TCT TTC CAC CAG ACG GGT ACC GCC ACT GGC AGG 1056
Gly Arg Ile His Ala Ser Phe His Gln Thr Gly Thr Ala Thr Gly Arg
340 345 350

TTG AGT AGC AGT GAT CCA AAT CTT CAG AAT CTT CCG ACA AAG AGC GAA 1104
Leu Ser Ser Ser Asp Pro Asn Leu Gln Asn Leu Pro Thr Lys Ser Glu
355 360 365

GAG GGA AAA GAA ATT AGA AAA GCG ATT GTG CCC CAG GAT CCA GAC TGG 1152
Glu Gly Lys Glu Ile Arg Lys Ala Ile Val Pro Gln Asp Pro Asp Trp
370 375 380

TGG ATC GTC AGT GCG GAT TAT TCC CAA ATA GAA CTC AGA ATC CTC GCT 1200
Trp Ile Val Ser Ala Asp Tyr Ser Gln Ile Glu Leu Arg Ile Leu Ala
385 390 395 400

CAT CTC AGT GGT GAT GAG AAC CTT GTG AAG GCC TTC GAG GAG GGC ATC 1248
His Leu Ser Gly Asp Glu Asn Leu Val Lys Ala Phe Glu Glu Gly Ile
405 410 415

GAT GTG CAC ACC TTG ACT GCC TCC AGG ATC TAC AAC GTA AAG CCA GAA 1296
Asp Val His Thr Leu Thr Ala Ser Arg Ile Tyr Asn Val Lys Pro Glu
420 425 430

GAA GTG AAC GAA GAA ATG CGA CGG GTT GGA AAG ATG GTG AAC TTC TCT 1344
Glu Val Asn Glu Glu Met Arg Arg Val Gly Lys Met Val Asn Phe Ser
435 440 445

ATA ATA TAC GGT GTC ACA CCG TAC GGT CTT TCT GTG AGA CTT GGA ATA 1392
Ile Ile Tyr Gly Val Thr Pro Tyr Gly Leu Ser Val Arg Leu Gly Ile
450 455 460

CCG GTT AAA GAA GCA GAA AAG ATG ATT ATC AGC TAT TTC ACA CTG TAT 1440
Pro Val Lys Glu Ala Glu Lys Met Ile Ile Ser Tyr Phe Thr Leu Tyr
465 470 475 480

CCA AAG GTG CGA AGC TAC ATC CAG CAG GTT GTT GCA GAG GCA AAA GAG 1488
Pro Lys Val Arg Ser Tyr Ile Gln Gln Val Val Ala Glu Ala Lys Glu
485 490 495

AAG GGC TAC GTC AGG ACT CTC TTT GGA AGA AAA AGA GAT ATT CCC CAG 1536
Lys Gly Tyr Val Arg Thr Leu Phe Gly Arg Lys Arg Asp Ile Pro Gln
500 505 510

CTC ATG GCA AGG GAC AAG AAC ACC CAG TCC GAA GGC GAA AGA ATC GCA 1584
Leu Met Ala Arg Asp Lys Asn Thr Gln Ser Glu Gly Glu Arg Ile Ala
515 520 525

ATA AAC ACC CCC ATT CAG GGA ACG GCG GCA GAT ATA ATA AAA TTG GCT 1632
Ile Asn Thr Pro Ile Gln Gly Thr Ala Ala Asp Ile Ile Lys Leu Ala
530 535 540

ATG ATA GAT ATA GAC GAG GAG CTG AGA AAA AGA AAC ATG AAA TCC AGA 1680
Met Ile Asp Ile Asp Glu Glu Leu Arg Lys Arg Asn Met Lys Ser Arg
545 550 555 560

ATG ATC ATT CAG GTT CAT GAC GAA CTG GTC TTC GAG GTT CCC GAT GAG 1728
Met Ile Ile Gln Val His Asp Glu Leu Val Phe Glu Val Pro Asp Glu
565 570 575

GAA AAA GAA GAA CTA GTT GAT CTG GTG AAG AAC AAA ATG ACA AAT GTG 1776
Glu Lys Glu Glu Leu Val Asp Leu Val Lys Asn Lys Met Thr Asn Val
580 585 590

GTG AAA CTC TCT GTG CCT CTT GAG GTT GAC ATA AGC ATC GGA AAA AGC 1824
Val Lys Leu Ser Val Pro Leu Glu Val Asp Ile Ser Ile Gly Lys Ser
595 600 605

TGG TCT TG 1833
Trp Ser
610
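
Comparing the codon row that ends at nucleotide 144 in SEQ ID NO: 7 with the same row in SEQ ID NO: 15 shows a single changed codon. The Python sketch below is offered only as an illustration of that comparison and is not part of the original listing; the function name `codon_differences` and its `first_codon` parameter are hypothetical.

```python
def codon_differences(row_a: str, row_b: str, first_codon: int = 1):
    """Return (codon number, codon in row_a, codon in row_b) for each mismatch.

    Both rows are space-separated codons; first_codon is the number of the
    first codon in the row, so results use the listing's own numbering.
    """
    codons_a, codons_b = row_a.split(), row_b.split()
    if len(codons_a) != len(codons_b):
        raise ValueError("rows must contain the same number of codons")
    return [
        (first_codon + i, a, b)
        for i, (a, b) in enumerate(zip(codons_a, codons_b))
        if a != b
    ]


# Codon rows covering nucleotides 97-144 (codons 33-48), copied from the listing.
SEQ_ID_7_ROW = "GAG GTT CCA TCT TTT GCC CTG GAC CTT GAA ACG TCC TCC CTT GAC CCG"
SEQ_ID_15_ROW = "GAG GTT CCA TCT TTT GCC CTG GAA CTT GAA ACG TCC TCC CTT GAC CCG"

if __name__ == "__main__":
    for number, old, new in codon_differences(SEQ_ID_7_ROW, SEQ_ID_15_ROW, first_codon=33):
        print(f"codon {number}: {old} -> {new}")
```

The reported codon change agrees with the residue lines of the listing, which show Asp at position 40 in SEQ ID NO: 7 and Glu at position 40 in SEQ ID NO: 15.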

(2) INFORMATION FOR SEQ ID NO:16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 610 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:

Met Lys Glu Leu Gln Leu Tyr Glu Glu Ala Glu Pro Thr Gly Tyr Glu
1 5 10 15

Ile Val Lys Asp His Lys Thr Phe Glu Asp Leu Ile Glu Lys Leu Lys
20 25 30

Glu Val Pro Ser Phe Ala Leu Glu Leu Glu Thr Ser Ser Leu Asp Pro
35 40 45

Phe Asn Cys Glu Ile Val Gly Ile Ser Val Ser Phe Lys Pro Lys Thr
50 55 60

Ala Tyr Tyr Ile Pro Leu His His Arg Asn Ala Gln Asn Leu Asp Glu
65 70 75 80

Thr Leu Val Leu Ser Lys Leu Lys Glu Ile Leu Glu Asp Pro Ser Ser
85 90 95

Lys Ile Val Gly Gln Asn Leu Lys Tyr Asp Tyr Lys Val Leu Met Val
100 105 110

Lys Gly Ile Ser Pro Val Tyr Pro His Phe Asp Thr Met Ile Ala Ala
115 120 125

Tyr Leu Leu Glu Pro Asn Glu Lys Lys Phe Asn Leu Glu Asp Leu Ser
130 135 140

Leu Lys Phe Leu Gly Tyr Lys Met Thr Ser Tyr Gln Glu Leu Met Ser
145 150 155 160

Phe Ser Ser Pro Leu Phe Gly Phe Ser Phe Ala Asp Val Pro Val Asp
165 170 175

Lys Ala Ala Asn Tyr Ser Cys Glu Asp Ala Asp Ile Thr Tyr Arg Leu
180 185 190

Tyr Lys Ile Leu Ser Met Lys Leu His Glu Ala Glu Leu Glu Asn Val
195 200 205

Phe Tyr Arg He Glu Met Pro Leu Val Asn Val Leu Ala Arg Met Glu 
210 215 220 

Leu Asn Gly Val Tyr Val Asp Thr Glu Phe Leu Lys Lys Leu Ser Glu 
225 230 235 240 

Glu Tyr Gly Lys Lys Leu Glu Glu Leu Ala Glu Lys He Tyr Gin He 
245 250 255 

Ala Gly Glu Pro Phe Asn He Asn Ser Pro Lys Gin Val Ser Lys He 
260 265 270 

Leu Phe Glu Lys Leu Gly lie Lys Pro Arg Gly Lys Thr Thr Lys Thr 
275 280 285 

Gly Ala Tyr Ser Thr Arg He Glu Val Leu Glu Glu He Ala Asn Glu 
290 295 300 

His Glu He Val Pro Leu He Leu Glu Tyr Arg Lys lie Gin Lys Leu 
305 310 315 320 
Lys Ser Thr Tyr Ile Asp Thr Leu Pro Lys Leu Val Asn Pro Lys Thr
325 330 335

Gly Arg Ile His Ala Ser Phe His Gln Thr Gly Thr Ala Thr Gly Arg
340 345 350

Leu Ser Ser Ser Asp Pro Asn Leu Gln Asn Leu Pro Thr Lys Ser Glu
355 360 365

Glu Gly Lys Glu Ile Arg Lys Ala Ile Val Pro Gln Asp Pro Asp Trp
370 375 380

Trp Ile Val Ser Ala Asp Tyr Ser Gln Ile Glu Leu Arg Ile Leu Ala
385 390 395 400

His Leu Ser Gly Asp Glu Asn Leu Val Lys Ala Phe Glu Glu Gly Ile
405 410 415

Asp Val His Thr Leu Thr Ala Ser Arg Ile Tyr Asn Val Lys Pro Glu
420 425 430

Glu Val Asn Glu Glu Met Arg Arg Val Gly Lys Met Val Asn Phe Ser
435 440 445

Ile Ile Tyr Gly Val Thr Pro Tyr Gly Leu Ser Val Arg Leu Gly Ile
450 455 460

Pro Val Lys Glu Ala Glu Lys Met Ile Ile Ser Tyr Phe Thr Leu Tyr
465 470 475 480

Pro Lys Val Arg Ser Tyr Ile Gln Gln Val Val Ala Glu Ala Lys Glu
485 490 495

Lys Gly Tyr Val Arg Thr Leu Phe Gly Arg Lys Arg Asp Ile Pro Gln
500 505 510

Leu Met Ala Arg Asp Lys Asn Thr Gln Ser Glu Gly Glu Arg Ile Ala
515 520 525

Ile Asn Thr Pro Ile Gln Gly Thr Ala Ala Asp Ile Ile Lys Leu Ala
530 535 540

Met Ile Asp Ile Asp Glu Glu Leu Arg Lys Arg Asn Met Lys Ser Arg
545 550 555 560

Met Ile Ile Gln Val His Asp Glu Leu Val Phe Glu Val Pro Asp Glu
565 570 575

Glu Lys Glu Glu Leu Val Asp Leu Val Lys Asn Lys Met Thr Asn Val
580 585 590

Val Lys Leu Ser Val Pro Leu Glu Val Asp Ile Ser Ile Gly Lys Ser
595 600 605

Trp Ser
610

(2) INFORMATION FOR SEQ ID NO: 17:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 17 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 
GACCTTGACA CGTCCTC 
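
As a quick consistency check on the 17-base oligonucleotide of SEQ ID NO: 17, the following minimal Python sketch counts its bases and searches for it in the stretch of SEQ ID NO: 18 printed for nucleotides 97 to 144. The helper name `clean`, the variable names, and the choice of fragment are illustrative assumptions and are not part of the listing.

```python
# Illustrative sketch only; names below are not part of the sequence listing.

def clean(seq: str) -> str:
    """Remove the spacing used in the listing and upper-case the bases."""
    return "".join(seq.split()).upper()

# SEQ ID NO: 17 as printed (blocks of ten bases).
oligo = clean("GACCTTGACA CGTCCTC")

# Nucleotides 97-144 of SEQ ID NO: 18 as printed (codons 33-48).
fragment_start = 97  # 1-based position of the first base of the fragment
fragment = clean("GAG GTT CCA TCT TTT GCC CTG GAC CTT GAC ACG TCC TCC CTT GAC CCG")

# 0-based offset of the oligonucleotide inside the fragment (-1 if absent).
offset = fragment.find(oligo)
# Convert to a 1-based coordinate on the coding sequence.
position = fragment_start + offset if offset >= 0 else None

print(len(oligo), offset, position)
```

Under these assumptions the oligonucleotide is 17 bases long and matches the coding strand starting at nucleotide 118, that is, at the first base of codon 40.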

(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1833 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..1833



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 

ATG AAG GAA CTT CAA CTG TAC GAA GAA GCA GAA CCC ACC GGA TAC GAA 48
Met Lys Glu Leu Gln Leu Tyr Glu Glu Ala Glu Pro Thr Gly Tyr Glu
1 5 10 15

ATC GTG AAG GAT CAT AAG ACC TTC GAA GAT CTC ATC GAA AAG CTG AAG 96
Ile Val Lys Asp His Lys Thr Phe Glu Asp Leu Ile Glu Lys Leu Lys
20 25 30

GAG GTT CCA TCT TTT GCC CTG GAC CTT GAC ACG TCC TCC CTT GAC CCG 144
Glu Val Pro Ser Phe Ala Leu Asp Leu Asp Thr Ser Ser Leu Asp Pro
35 40 45

TTC AAC TGT GAG ATA GTC GGC ATC TCC GTG TCG TTC AAA CCG AAA ACA 192
Phe Asn Cys Glu Ile Val Gly Ile Ser Val Ser Phe Lys Pro Lys Thr
50 55 60

GCT TAT TAC ATT CCA CTT CAT CAC AGA AAC GCC CAG AAT CTT GAT GAA 240
Ala Tyr Tyr Ile Pro Leu His His Arg Asn Ala Gln Asn Leu Asp Glu
65 70 75 80

ACA CTG GTG CTG TCG AAG TTG AAA GAG ATC CTC GAA GAC CCG TCT TCG 288
Thr Leu Val Leu Ser Lys Leu Lys Glu Ile Leu Glu Asp Pro Ser Ser
85 90 95

AAG ATT GTG GGT CAG AAC CTG AAG TAC GAC TAC AAG GTT CTT ATG GTA 336
Lys Ile Val Gly Gln Asn Leu Lys Tyr Asp Tyr Lys Val Leu Met Val
100 105 110

AAG GGT ATA TCG CCA GTT TAT CCG CAT TTT GAC ACG ATG ATA GCT GCA 384
Lys Gly Ile Ser Pro Val Tyr Pro His Phe Asp Thr Met Ile Ala Ala
115 120 125
TAT TTG CTG GAG CCA AAC GAG AAA AAA TTC AAT CTC GAA GAT CTG TCT 432
Tyr Leu Leu Glu Pro Asn Glu Lys Lys Phe Asn Leu Glu Asp Leu Ser
130 135 140

TTG AAA TTT CTC GGA TAC AAA ATG ACG TCT TAT CAG GAA CTG ATG TCG 480
Leu Lys Phe Leu Gly Tyr Lys Met Thr Ser Tyr Gln Glu Leu Met Ser
145 150 155 160

TTT TCC TCA CCA CTT TTT GGT TTC AGC TTT GCG GAT GTT CCG GTA GAC 528
Phe Ser Ser Pro Leu Phe Gly Phe Ser Phe Ala Asp Val Pro Val Asp
165 170 175

AAG GCT GCG AAC TAC TCC TGC GAG GAT GCA GAC ATC ACT TAT AGG CTC 576
Lys Ala Ala Asn Tyr Ser Cys Glu Asp Ala Asp Ile Thr Tyr Arg Leu
180 185 190

TAC AAG ATA CTC AGC ATG AAG CTC CAT GAA GCG GAA CTT GAG AAC GTC 624
Tyr Lys Ile Leu Ser Met Lys Leu His Glu Ala Glu Leu Glu Asn Val
195 200 205

TTC TAC AGG ATA GAG ATG CCG CTT GTG AAC GTT CTT GCA CGC ATG GAA 672
Phe Tyr Arg Ile Glu Met Pro Leu Val Asn Val Leu Ala Arg Met Glu
210 215 220

TTG AAC GGG GTG TAT GTG GAC ACA GAA TTC CTG AAA AAG CTC TCG GAG 720
Leu Asn Gly Val Tyr Val Asp Thr Glu Phe Leu Lys Lys Leu Ser Glu
225 230 235 240

GAG TAC GGC AAA AAG CTC GAG GAA CTG GCC GAA AAA ATC TAC CAG ATA 768
Glu Tyr Gly Lys Lys Leu Glu Glu Leu Ala Glu Lys Ile Tyr Gln Ile
245 250 255

GCA GGA GAG CCC TTC AAC ATC AAT TCT CCA AAA CAG GTT TCA AAG ATC 816
Ala Gly Glu Pro Phe Asn Ile Asn Ser Pro Lys Gln Val Ser Lys Ile
260 265 270

CTT TTT GAG AAG CTG GGA ATA AAA CCC CGT GGA AAA ACG ACA AAA ACA 864
Leu Phe Glu Lys Leu Gly Ile Lys Pro Arg Gly Lys Thr Thr Lys Thr
275 280 285

GGA GCG TAC TCT ACC AGG ATA GAG GTG TTG GAA GAG ATA GCG AAT GAG 912
Gly Ala Tyr Ser Thr Arg Ile Glu Val Leu Glu Glu Ile Ala Asn Glu
290 295 300

CAC GAG ATA GTA CCC CTC ATT CTC GAG TAC AGA AAG ATC CAG AAA CTG 960
His Glu Ile Val Pro Leu Ile Leu Glu Tyr Arg Lys Ile Gln Lys Leu
305 310 315 320

AAA TCG ACC TAC ATA GAC ACC CTT CCG AAA CTT GTG AAC CCG AAA ACC 1008
Lys Ser Thr Tyr Ile Asp Thr Leu Pro Lys Leu Val Asn Pro Lys Thr
325 330 335

GGA AGA ATT CAT GCA TCT TTC CAC CAG ACG GGT ACC GCC ACT GGC AGG 1056
Gly Arg Ile His Ala Ser Phe His Gln Thr Gly Thr Ala Thr Gly Arg
340 345 350

TTG AGT AGC AGT GAT CCA AAT CTT CAG AAT CTT CCG ACA AAG AGC GAA 1104
Leu Ser Ser Ser Asp Pro Asn Leu Gln Asn Leu Pro Thr Lys Ser Glu
355 360 365
GAG GGA AAA GAA ATT AGA AAA GCG ATT GTG CCC CAG GAT CCA GAC TGG 1152
Glu Gly Lys Glu Ile Arg Lys Ala Ile Val Pro Gln Asp Pro Asp Trp
370 375 380

TGG ATC GTC AGT GCG GAT TAT TCC CAA ATA GAA CTC AGA ATC CTC GCT 1200
Trp Ile Val Ser Ala Asp Tyr Ser Gln Ile Glu Leu Arg Ile Leu Ala
385 390 395 400

CAT CTC AGT GGT GAT GAG AAC CTT GTG AAG GCC TTC GAG GAG GGC ATC 1248
His Leu Ser Gly Asp Glu Asn Leu Val Lys Ala Phe Glu Glu Gly Ile
405 410 415

GAT GTG CAC ACC TTG ACT GCC TCC AGG ATC TAC AAC GTA AAG CCA GAA 1296
Asp Val His Thr Leu Thr Ala Ser Arg Ile Tyr Asn Val Lys Pro Glu
420 425 430

GAA GTG AAC GAA GAA ATG CGA CGG GTT GGA AAG ATG GTG AAC TTC TCT 1344
Glu Val Asn Glu Glu Met Arg Arg Val Gly Lys Met Val Asn Phe Ser
435 440 445

ATA ATA TAC GGT GTC ACA CCG TAC GGT CTT TCT GTG AGA CTT GGA ATA 1392
Ile Ile Tyr Gly Val Thr Pro Tyr Gly Leu Ser Val Arg Leu Gly Ile
450 455 460

CCG GTT AAA GAA GCA GAA AAG ATG ATT ATC AGC TAT TTC ACA CTG TAT 1440
Pro Val Lys Glu Ala Glu Lys Met Ile Ile Ser Tyr Phe Thr Leu Tyr
465 470 475 480

CCA AAG GTG CGA AGC TAC ATC CAG CAG GTT GTT GCA GAG GCA AAA GAG 1488
Pro Lys Val Arg Ser Tyr Ile Gln Gln Val Val Ala Glu Ala Lys Glu
485 490 495

AAG GGC TAC GTC AGG ACT CTC TTT GGA AGA AAA AGA GAT ATT CCC CAG 1536
Lys Gly Tyr Val Arg Thr Leu Phe Gly Arg Lys Arg Asp Ile Pro Gln
500 505 510

CTC ATG GCA AGG GAC AAG AAC ACC CAG TCC GAA GGC GAA AGA ATC GCA 1584
Leu Met Ala Arg Asp Lys Asn Thr Gln Ser Glu Gly Glu Arg Ile Ala
515 520 525

ATA AAC ACC CCC ATT CAG GGA ACG GCG GCA GAT ATA ATA AAA TTG GCT 1632
Ile Asn Thr Pro Ile Gln Gly Thr Ala Ala Asp Ile Ile Lys Leu Ala
530 535 540

ATG ATA GAT ATA GAC GAG GAG CTG AGA AAA AGA AAC ATG AAA TCC AGA 1680
Met Ile Asp Ile Asp Glu Glu Leu Arg Lys Arg Asn Met Lys Ser Arg
545 550 555 560

ATG ATC ATT CAG GTT CAT GAC GAA CTG GTC TTC GAG GTT CCC GAT GAG 1728
Met Ile Ile Gln Val His Asp Glu Leu Val Phe Glu Val Pro Asp Glu
565 570 575

GAA AAA GAA GAA CTA GTT GAT CTG GTG AAG AAC AAA ATG ACA AAT GTG 1776
Glu Lys Glu Glu Leu Val Asp Leu Val Lys Asn Lys Met Thr Asn Val
580 585 590

GTG AAA CTC TCT GTG CCT CTT GAG GTT GAC ATA AGC ATC GGA AAA AGC 1824
Val Lys Leu Ser Val Pro Leu Glu Val Asp Ile Ser Ile Gly Lys Ser
595 600 605

TGG TCT TG 1833
Trp Ser
610
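
The paired nucleotide and amino acid lines of a CDS listing can be cross-checked mechanically. The sketch below is a minimal Python illustration that assumes the standard genetic code; the helper names (`CODON_TABLE`, `THREE_LETTER`, `translate`) are hypothetical and are not taken from the specification. It translates the codons printed for positions 17 to 32 of SEQ ID NO: 18 and compares the result with the residues listed beneath them.

```python
# Illustrative sketch; assumes the standard genetic code. Helper names are hypothetical.
from itertools import product

BASES = "TCAG"
# Amino acids in TCAG order (first, second, third base), '*' marks a stop codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Ter",
}

def translate(codon_row: str) -> list[str]:
    """Translate a whitespace-separated row of codons into three-letter residues."""
    return [THREE_LETTER[CODON_TABLE[codon]] for codon in codon_row.split()]

# Codons 17-32 of SEQ ID NO: 18 (nucleotides 49-96) and the residues listed for them.
row = "ATC GTG AAG GAT CAT AAG ACC TTC GAA GAT CTC ATC GAA AAG CTG AAG"
listed = "Ile Val Lys Asp His Lys Thr Phe Glu Asp Leu Ile Glu Lys Leu Lys"

matches = translate(row) == listed.split()
print(matches)
```

Applying the same comparison row by row would flag any codon whose printed translation disagrees with the standard code.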

(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 610 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 

Met Lys Glu Leu Gln Leu Tyr Glu Glu Ala Glu Pro Thr Gly Tyr Glu
1 5 10 15

Ile Val Lys Asp His Lys Thr Phe Glu Asp Leu Ile Glu Lys Leu Lys
20 25 30

Glu Val Pro Ser Phe Ala Leu Asp Leu Asp Thr Ser Ser Leu Asp Pro
35 40 45

Phe Asn Cys Glu Ile Val Gly Ile Ser Val Ser Phe Lys Pro Lys Thr
50 55 60

Ala Tyr Tyr Ile Pro Leu His His Arg Asn Ala Gln Asn Leu Asp Glu
65 70 75 80

Thr Leu Val Leu Ser Lys Leu Lys Glu Ile Leu Glu Asp Pro Ser Ser
85 90 95

Lys Ile Val Gly Gln Asn Leu Lys Tyr Asp Tyr Lys Val Leu Met Val
100 105 110

Lys Gly Ile Ser Pro Val Tyr Pro His Phe Asp Thr Met Ile Ala Ala
115 120 125

Tyr Leu Leu Glu Pro Asn Glu Lys Lys Phe Asn Leu Glu Asp Leu Ser
130 135 140

Leu Lys Phe Leu Gly Tyr Lys Met Thr Ser Tyr Gln Glu Leu Met Ser
145 150 155 160

Phe Ser Ser Pro Leu Phe Gly Phe Ser Phe Ala Asp Val Pro Val Asp
165 170 175

Lys Ala Ala Asn Tyr Ser Cys Glu Asp Ala Asp Ile Thr Tyr Arg Leu
180 185 190

Tyr Lys Ile Leu Ser Met Lys Leu His Glu Ala Glu Leu Glu Asn Val
195 200 205

Phe Tyr Arg Ile Glu Met Pro Leu Val Asn Val Leu Ala Arg Met Glu
210 215 220

Leu Asn Gly Val Tyr Val Asp Thr Glu Phe Leu Lys Lys Leu Ser Glu
225 230 235 240

Glu Tyr Gly Lys Lys Leu Glu Glu Leu Ala Glu Lys Ile Tyr Gln Ile
245 250 255
Ala Gly Glu Pro Phe Asn Ile Asn Ser Pro Lys Gln Val Ser Lys Ile
260 265 270

Leu Phe Glu Lys Leu Gly Ile Lys Pro Arg Gly Lys Thr Thr Lys Thr
275 280 285

Gly Ala Tyr Ser Thr Arg Ile Glu Val Leu Glu Glu Ile Ala Asn Glu
290 295 300

His Glu Ile Val Pro Leu Ile Leu Glu Tyr Arg Lys Ile Gln Lys Leu
305 310 315 320

Lys Ser Thr Tyr Ile Asp Thr Leu Pro Lys Leu Val Asn Pro Lys Thr
325 330 335

Gly Arg Ile His Ala Ser Phe His Gln Thr Gly Thr Ala Thr Gly Arg
340 345 350

Leu Ser Ser Ser Asp Pro Asn Leu Gln Asn Leu Pro Thr Lys Ser Glu
355 360 365

Glu Gly Lys Glu Ile Arg Lys Ala Ile Val Pro Gln Asp Pro Asp Trp
370 375 380

Trp Ile Val Ser Ala Asp Tyr Ser Gln Ile Glu Leu Arg Ile Leu Ala
385 390 395 400

His Leu Ser Gly Asp Glu Asn Leu Val Lys Ala Phe Glu Glu Gly Ile
405 410 415

Asp Val His Thr Leu Thr Ala Ser Arg Ile Tyr Asn Val Lys Pro Glu
420 425 430

Glu Val Asn Glu Glu Met Arg Arg Val Gly Lys Met Val Asn Phe Ser
435 440 445

Ile Ile Tyr Gly Val Thr Pro Tyr Gly Leu Ser Val Arg Leu Gly Ile
450 455 460

Pro Val Lys Glu Ala Glu Lys Met Ile Ile Ser Tyr Phe Thr Leu Tyr
465 470 475 480

Pro Lys Val Arg Ser Tyr Ile Gln Gln Val Val Ala Glu Ala Lys Glu
485 490 495

Lys Gly Tyr Val Arg Thr Leu Phe Gly Arg Lys Arg Asp Ile Pro Gln
500 505 510

Leu Met Ala Arg Asp Lys Asn Thr Gln Ser Glu Gly Glu Arg Ile Ala
515 520 525

Ile Asn Thr Pro Ile Gln Gly Thr Ala Ala Asp Ile Ile Lys Leu Ala
530 535 540

Met Ile Asp Ile Asp Glu Glu Leu Arg Lys Arg Asn Met Lys Ser Arg
545 550 555 560

Met Ile Ile Gln Val His Asp Glu Leu Val Phe Glu Val Pro Asp Glu
565 570 575

Glu Lys Glu Glu Leu Val Asp Leu Val Lys Asn Lys Met Thr Asn Val
580 585 590

Val Lys Leu Ser Val Pro Leu Glu Val Asp Ile Ser Ile Gly Lys Ser
595 600 605

Trp Ser
610

(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 39 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:
TAAGTGATAT CTGCATCCTC GCAGGAGAAG TTCGCAGCC 39

(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 39 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21: 
ACAAGGCTGC GAACTTCTCC TGCGAGGATG CAGATATCA 39
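
The 39-base oligonucleotides of SEQ ID NOS: 20 and 21 can be compared strand against strand. The following Python sketch is an illustration only: it computes the reverse complement of SEQ ID NO: 21 and measures how much of it coincides with the 3' end of SEQ ID NO: 20. The function and variable names are hypothetical, and the sketch makes no claim about how the oligonucleotides were designed or used.

```python
# Illustrative sketch; names are hypothetical and not part of the listing.

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

# Sequences as printed, with the ten-base spacing removed.
seq20 = "".join("TAAGTGATAT CTGCATCCTC GCAGGAGAAG TTCGCAGCC".split())
seq21 = "".join("ACAAGGCTGC GAACTTCTCC TGCGAGGATG CAGATATCA".split())

rc21 = reverse_complement(seq21)

# Longest suffix of SEQ ID NO: 20 that equals a prefix of the reverse
# complement of SEQ ID NO: 21 (k = 0 always matches, so max() is safe).
overlap = max(
    k for k in range(len(seq20) + 1)
    if seq20[len(seq20) - k:] == rc21[:k]
)

print(len(seq20), len(seq21), overlap)
```

With the sequences as printed, both oligonucleotides are 39 bases long and are complementary over 35 bases, offset from one another by four positions.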

(2) INFORMATION FOR SEQ ID NO: 22: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1833 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..1833 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:22: 

ATG AAG GAA CTT CAA CTG TAC GAA GAA GCA GAA CCC ACC GGA TAC GAA 48
Met Lys Glu Leu Gln Leu Tyr Glu Glu Ala Glu Pro Thr Gly Tyr Glu
1 5 10 15

ATC GTG AAG GAT CAT AAG ACC TTC GAA GAT CTC ATC GAA AAG CTG AAG 96
Ile Val Lys Asp His Lys Thr Phe Glu Asp Leu Ile Glu Lys Leu Lys
20 25 30

GAG GTT CCA TCT TTT GCC CTG GAC CTT GAA ACG TCC TCC CTT GAC CCG 144 
Glu Val Pro Ser Phe Ala Leu Asp Leu Glu Thr Ser Ser Leu Asp Pro 
35 40 45 
TTC AAC TGT GAG ATA GTC GGC ATC TCC GTG TCG TTC AAA CCG AAA ACA 192
Phe Asn Cys Glu Ile Val Gly Ile Ser Val Ser Phe Lys Pro Lys Thr
50 55 60

GCT TAT TAC ATT CCA CTT CAT CAC AGA AAC GCC CAG AAT CTT GAT GAA 240
Ala Tyr Tyr Ile Pro Leu His His Arg Asn Ala Gln Asn Leu Asp Glu
65 70 75 80

ACA CTG GTG CTG TCG AAG TTG AAA GAG ATC CTC GAA GAC CCG TCT TCG 288
Thr Leu Val Leu Ser Lys Leu Lys Glu Ile Leu Glu Asp Pro Ser Ser
85 90 95

AAG ATT GTG GGT CAG AAC CTG AAG TAC GAC TAC AAG GTT CTT ATG GTA 336
Lys Ile Val Gly Gln Asn Leu Lys Tyr Asp Tyr Lys Val Leu Met Val
100 105 110

AAG GGT ATA TCG CCA GTT TAT CCG CAT TTT GAC ACG ATG ATA GCT GCA 384
Lys Gly Ile Ser Pro Val Tyr Pro His Phe Asp Thr Met Ile Ala Ala
115 120 125

TAT TTG CTG GAG CCA AAC GAG AAA AAA TTC AAT CTC GAA GAT CTG TCT 432
Tyr Leu Leu Glu Pro Asn Glu Lys Lys Phe Asn Leu Glu Asp Leu Ser
130 135 140

TTG AAA TTT CTC GGA TAC AAA ATG ACG TCT TAT CAG GAA CTG ATG TCG 480
Leu Lys Phe Leu Gly Tyr Lys Met Thr Ser Tyr Gln Glu Leu Met Ser
145 150 155 160

TTT TCC TCA CCA CTT TTT GGT TTC AGC TTT GCG GAT GTT CCG GTA GAC 528 
Phe Ser Ser Pro Leu Phe Gly Phe Ser Phe Ala Asp Val Pro Val Asp 
165 170 175 

AAG GCT GCG AAC TTC TCC TGC GAG GAT GCA GAT ATC ACT TAT AGG CTC 576 
Lys Ala Ala Asn Phe Ser Cys Glu Asp Ala Asp He Thr Tyr Arg Leu 
180 185 190 

TAC AAG ATA CTC AGC ATG AAG CTC CAT GAA GCG GAA CTT GAG AAC GTC 624 
Tyr Lys He Leu Ser Met Lys Leu His Glu Ala Glu Leu Glu Asn Val 
195 200 205 

TTC TAC AGG ATA GAG ATG CCG CTT GTG AAC GTT CTT GCA CGC ATG GAA 672 
Phe Tyr Arg He Glu Met Pro Leu Val Asn Val Leu Ala Arg Met Glu 
210 215 220 

TTG AAC GGG GTG TAT GTG GAC ACA GAA TTC CTG AAA AAG CTC TCG GAG 720 
Leu Asn Gly Val Tyr Val Asp Thr Glu Phe Leu Lys Lys Leu Ser Glu 
225 230 235 240 

GAG TAC GGC AAA AAG CTC GAG GAA CTG GCC GAA AAA ATC TAC CAG ATA 768
Glu Tyr Gly Lys Lys Leu Glu Glu Leu Ala Glu Lys Ile Tyr Gln Ile
245 250 255

GCA GGA GAG CCC TTC AAC ATC AAT TCT CCA AAA CAG GTT TCA AAG ATC 816
Ala Gly Glu Pro Phe Asn Ile Asn Ser Pro Lys Gln Val Ser Lys Ile
260 265 270

CTT TTT GAG AAG CTG GGA ATA AAA CCC CGT GGA AAA ACG ACA AAA ACA 864
Leu Phe Glu Lys Leu Gly Ile Lys Pro Arg Gly Lys Thr Thr Lys Thr
275 280 285

GGA GCG TAC TCT ACC AGG ATA GAG GTG TTG GAA GAG ATA GCG AAT GAG 912 
Gly Ala Tyr Ser Thr Arg lie Glu Val Leu Glu Glu lie Ala Asn Glu 
290 295 300 

CAC GAG ATA GTA CCC CTC ATT CTC GAG TAC AGA AAG ATC CAG AAA CTG 960 
His Glu lie Val Pro Leu lie Leu Glu Tyr Arg Lys lie Gin Lys Leu 
305 310 315 320 

AAA TCG ACC TAC ATA GAC ACC CTT CCG AAA CTT GTG AAC CCG AAA ACC 1008 
Lys Ser Thr Tyr lie Asp Thr Leu Pro Lys Leu Val Asn Pro Lys Thr 
325 330 335 

GGA AGA ATT CAT GCA TCT TTC CAC CAG ACG GGT ACC GCC ACT GGC AGG 1056 
Gly Arg He His Ala Ser Phe His Gin Thr Gly Thr Ala Thr Gly Arg 
340 345 350 

TTG AGT AGC AGT GAT CCA AAT CTT CAG AAT CTT CCG ACA AAG AGC GAA 1104 
Leu Ser Ser Ser Asp Pro Asn Leu Gin Asn Leu Pro Thr Lys Ser Glu 
355 360 365 

GAG GGA AAA GAA ATT AGA AAA GCG ATT GTG CCC CAG GAT CCA GAC TGG 1152 
Glu Gly Lys Glu He Arg Lys Ala He Val Pro Gin Asp Pro Asp Trp 
370 375 380 

TGG ATC GTC AGT GCG GAT TAT TCC CAA ATA GAA CTC AGA ATC CTC GCT 1200 
Trp He Val Ser Ala Asp Tyr Ser Gin He Glu Leu Arg He Leu Ala 
385 390 395 400 

CAT CTC AGT GGT GAT GAG AAC CTT GTG AAG GCC TTC GAG GAG GGC ATC 1248 
His Leu Ser Gly Asp Glu Asn Leu Val Lys Ala Phe Glu Glu Gly He 
405 410 415 

GAT GTG CAC ACC TTG ACT GCC TCC AGG ATC TAC AAC GTA AAG CCA GAA 1296 
Asp Val His Thr Leu Thr Ala Ser Arg He Tyr Asn Val Lys Pro Glu 
420 425 430 

GAA GTG AAC GAA GAA ATG CGA CGG GTT GGA AAG ATG GTG AAC TTC TCT 1344 
Glu Val Asn Glu Glu Met Arg Arg Val Gly Lys Met Val Asn Phe Ser 
435 440 445 

ATA ATA TAC GGT GTC ACA CCG TAC GGT CTT TCT GTG AGA CTT GGA ATA 1392
Ile Ile Tyr Gly Val Thr Pro Tyr Gly Leu Ser Val Arg Leu Gly Ile
450 455 460

CCG GTT AAA GAA GCA GAA AAG ATG ATT ATC AGC TAT TTC ACA CTG TAT 1440
Pro Val Lys Glu Ala Glu Lys Met Ile Ile Ser Tyr Phe Thr Leu Tyr
465 470 475 480

CCA AAG GTG CGA AGC TAC ATC CAG CAG GTT GTT GCA GAG GCA AAA GAG 1488
Pro Lys Val Arg Ser Tyr Ile Gln Gln Val Val Ala Glu Ala Lys Glu
485 490 495

AAG GGC TAC GTC AGG ACT CTC TTT GGA AGA AAA AGA GAT ATT CCC CAG 1536
Lys Gly Tyr Val Arg Thr Leu Phe Gly Arg Lys Arg Asp Ile Pro Gln
500 505 510

CTC ATG GCA AGG GAC AAG AAC ACC CAG TCC GAA GGC GAA AGA ATC GCA 1584
Leu Met Ala Arg Asp Lys Asn Thr Gln Ser Glu Gly Glu Arg Ile Ala
515 520 525

ATA AAC ACC CCC ATT CAG GGA ACG GCG GCA GAT ATA ATA AAA TTG GCT 1632
Ile Asn Thr Pro Ile Gln Gly Thr Ala Ala Asp Ile Ile Lys Leu Ala
530 535 540

ATG ATA GAT ATA GAC GAG GAG CTG AGA AAA AGA AAC ATG AAA TCC AGA 1680
Met Ile Asp Ile Asp Glu Glu Leu Arg Lys Arg Asn Met Lys Ser Arg
545 550 555 560

ATG ATC ATT CAG GTT CAT GAC GAA CTG GTC TTC GAG GTT CCC GAT GAG 1728
Met Ile Ile Gln Val His Asp Glu Leu Val Phe Glu Val Pro Asp Glu
565 570 575

GAA AAA GAA GAA CTA GTT GAT CTG GTG AAG AAC AAA ATG ACA AAT GTG 1776
Glu Lys Glu Glu Leu Val Asp Leu Val Lys Asn Lys Met Thr Asn Val
580 585 590

GTG AAA CTC TCT GTG CCT CTT GAG GTT GAC ATA AGC ATC GGA AAA AGC 1824
Val Lys Leu Ser Val Pro Leu Glu Val Asp Ile Ser Ile Gly Lys Ser
595 600 605

TGG TCT TG 1833
Trp Ser
610
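
Closely related coding sequences in this listing differ at only a few codons, and those differences can be located by a codon-wise comparison. The Python sketch below is a minimal illustration that assumes the standard genetic code; the function `codon_differences` and the other names are hypothetical. It compares the rows printed for codons 177 to 192 of SEQ ID NO: 18 and SEQ ID NO: 22 and reports each differing codon together with the encoded residues in one-letter code.

```python
# Illustrative sketch; assumes the standard genetic code. Names are hypothetical.
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def codon_differences(ref_row: str, alt_row: str, first_codon: int):
    """List (codon number, ref codon, alt codon, ref residue, alt residue)
    for every position where two aligned codon rows differ."""
    diffs = []
    for i, (ref, alt) in enumerate(zip(ref_row.split(), alt_row.split())):
        if ref != alt:
            diffs.append(
                (first_codon + i, ref, alt, CODON_TABLE[ref], CODON_TABLE[alt])
            )
    return diffs

# Codons 177-192 (nucleotides 529-576) as printed in the two listings.
row_seq18 = "AAG GCT GCG AAC TAC TCC TGC GAG GAT GCA GAC ATC ACT TAT AGG CTC"
row_seq22 = "AAG GCT GCG AAC TTC TCC TGC GAG GAT GCA GAT ATC ACT TAT AGG CTC"

diffs = codon_differences(row_seq18, row_seq22, first_codon=177)
for number, ref, alt, ref_aa, alt_aa in diffs:
    kind = "silent" if ref_aa == alt_aa else "substitution"
    print(number, ref, alt, ref_aa, alt_aa, kind)
```

For these two rows the comparison reports a Tyr to Phe change at codon 181 (TAC to TTC) and a silent change at codon 187 (GAC to GAT), consistent with the residues printed beneath each row.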

(2) INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 610 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:

Met Lys Glu Leu Gln Leu Tyr Glu Glu Ala Glu Pro Thr Gly Tyr Glu
1 5 10 15

Ile Val Lys Asp His Lys Thr Phe Glu Asp Leu Ile Glu Lys Leu Lys
20 25 30

Glu Val Pro Ser Phe Ala Leu Asp Leu Glu Thr Ser Ser Leu Asp Pro
35 40 45

Phe Asn Cys Glu Ile Val Gly Ile Ser Val Ser Phe Lys Pro Lys Thr
50 55 60

Ala Tyr Tyr He Pro Leu His His Arg Asn Ala Gin Asn Leu Asp Glu 
65 70 75 80 

Thr Leu Val Leu Ser Lys Leu Lys Glu He Leu Glu Asp Pro Ser Ser 
85 90 95 

Lys He Val Gly Gin Asn Leu Lys Tyr Asp Tyr Lys Val Leu Met Val 
100 105 110 

Lys Gly He Ser Pro Val Tyr Pro His Phe Asp Thr Met He Ala Ala 
115 120 125 
Tyr Leu Leu Glu Pro Asn Glu Lys Lys Phe Asn Leu Glu Asp Leu Ser
130 135 140 

Leu Lys Phe Leu Gly Tyr Lys Met Thr Ser Tyr Gin Glu Leu Met Ser 
145 150 155 160 

Phe Ser Ser Pro Leu Phe Gly Phe Ser Phe Ala Asp Val Pro Val Asp 
165 170 175 

Lys Ala Ala Asn Phe Ser Cys Glu Asp Ala Asp He Thr Tyr Arg Leu 
180 185 190 

Tyr Lys He Leu Ser Met Lys Leu His Glu Ala Glu Leu Glu Asn Val 
195 200 205 

Phe Tyr Arg He Glu Met Pro Leu Val Asn Val Leu Ala Arg Met Glu 
210 215 220 

Leu Asn Gly Val Tyr Val Asp Thr Glu Phe Leu Lys Lys Leu Ser Glu 
225 230 235 240 

Glu Tyr Gly Lys Lys Leu Glu Glu Leu Ala Glu Lys He Tyr Gin He 
245 250 255 

Ala Gly Glu Pro Phe Asn He Asn Ser Pro Lys Gin Val Ser Lys He 
260 265 270 

Leu Phe Glu Lys Leu Gly He Lys Pro Arg Gly Lys Thr Thr Lys Thr 
275 280 285 

Gly Ala Tyr Ser Thr Arg He Glu Val Leu Glu Glu He Ala Asn Glu 
290 295 300 

His Glu He Val Pro Leu He Leu Glu Tyr Arg Lys He Gin Lys Leu 
305 310 315 320 

Lys Ser Thr Tyr He Asp Thr Leu Pro Lys Leu Val Asn Pro Lys Thr 
325 330 335 

Gly Arg He His Ala Ser Phe His Gin Thr Gly Thr Ala Thr Gly Arg 
340 345 350 

Leu Ser Ser Ser Asp Pro Asn Leu Gin Asn Leu Pro Thr Lys Ser Glu 
355 360 365 

Glu Gly Lys Glu He Arg Lys Ala He Val Pro Gin Asp Pro Asp Trp 
370 375 380 

Trp He Val Ser Ala Asp Tyr Ser Gin He Glu Leu Arg He Leu Ala 
385 390 395 400 

His Leu Ser Gly Asp Glu Asn Leu Val Lys Ala Phe Glu Glu Gly He 
405 410 415 

Asp Val His Thr Leu Thr Ala Ser Arg He Tyr Asn Val Lys Pro Glu 
420 425 430 

Glu Val Asn Glu Glu Met Arg Arg Val Gly Lys Met Val Asn Phe Ser 
435 440 445 

He He Tyr Gly Val Thr Pro Tyr Gly Leu Ser Val Arg Leu Gly He 
450 455 460 
Pro Val Lys Glu Ala Glu Lys Met Ile Ile Ser Tyr Phe Thr Leu Tyr
465 470 475 480

Pro Lys Val Arg Ser Tyr Ile Gln Gln Val Val Ala Glu Ala Lys Glu
485 490 495

Lys Gly Tyr Val Arg Thr Leu Phe Gly Arg Lys Arg Asp Ile Pro Gln
500 505 510

Leu Met Ala Arg Asp Lys Asn Thr Gin Ser Glu Gly Glu Arg He Ala 
515 520 525 

He Asn Thr Pro He Gin Gly Thr Ala Ala Asp He He Lys Leu Ala 
530 535 540 

Met He Asp He Asp Glu Glu Leu Arg Lys Arg Asn Met Lys Ser Arg 
545 550 555 560 

Met He He Gin Val His Asp Glu Leu Val Phe Glu Val Pro Asp Glu 
565 570 575 

Glu Lys Glu Glu Leu Val Asp Leu Val Lys Asn Lys Met Thr Asn Val 
580 585 590 

Val Lys Leu Ser Val Pro Leu Glu Val Asp He Ser He Gly Lys Ser 
595 600 605 

Trp Ser 
610 

(2) INFORMATION FOR SEQ ID NO: 24:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 32 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24:
ACTCCTGCGA GAATGCTGAC ATCACTTATA GG



(2) INFORMATION FOR SEQ ID NO: 25: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1833 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..1833

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25:



ATG AAG GAA CTT CAA CTG TAC GAA GAA GCA GAA CCC ACC GGA TAC GAA 48
Met Lys Glu Leu Gln Leu Tyr Glu Glu Ala Glu Pro Thr Gly Tyr Glu
1 5 10 15

ATC GTG AAG GAT CAT AAG ACC TTC GAA GAT CTC ATC GAA AAG CTG AAG 96
Ile Val Lys Asp His Lys Thr Phe Glu Asp Leu Ile Glu Lys Leu Lys
20 25 30

GAG GTT CCA TCT TTT GCC CTG GAC CTT GAA ACG TCC TCC CTT GAC CCG 144 
Glu Val Pro Ser Phe Ala Leu Asp Leu Glu Thr Ser Ser Leu Asp Pro 
35 40 45 

TTC AAC TGT GAG ATA GTC GGC ATC TCC GTG TCG TTC AAA CCG AAA ACA 192 
Phe Asn Cys Glu Ile Val Gly Ile Ser Val Ser Phe Lys Pro Lys Thr
50 55 60

GCT TAT TAC ATT CCA CTT CAT CAC AGA AAC GCC CAG AAT CTT GAT GAA 240 
Ala Tyr Tyr He Pro Leu His His Arg Asn Ala Gin Asn Leu Asp Glu 
65 70 75 80 

ACA CTG GTG CTG TCG AAG TTG AAA GAG ATC CTC GAA GAC CCG TCT TCG 288
Thr Leu Val Leu Ser Lys Leu Lys Glu Ile Leu Glu Asp Pro Ser Ser
85 90 95

AAG ATT GTG GGT CAG AAC CTG AAG TAC GAC TAC AAG GTT CTT ATG GTA 336
Lys Ile Val Gly Gln Asn Leu Lys Tyr Asp Tyr Lys Val Leu Met Val
100 105 110

AAG GGT ATA TCG CCA GTT TAT CCG CAT TTT GAC ACG ATG ATA GCT GCA 384 
Lys Gly He Ser Pro Val Tyr Pro His Phe Asp Thr Met He Ala Ala 
115 120 125 

TAT TTG CTG GAG CCA AAC GAG AAA AAA TTC AAT CTC GAA GAT CTG TCT 432 
Tyr Leu Leu Glu Pro Asn Glu Lys Lys Phe Asn Leu Glu Asp Leu Ser 
130 135 140 

TTG AAA TTT CTC GGA TAC AAA ATG ACG TCT TAT CAG GAA CTG ATG TCG 480 
Leu Lys Phe Leu Gly Tyr Lys Met Thr Ser Tyr Gln Glu Leu Met Ser
145 150 155 160

TTT TCC TCA CCA CTT TTT GGT TTC AGC TTT GCG GAT GTT CCG GTA GAC 528 
Phe Ser Ser Pro Leu Phe Gly Phe Ser Phe Ala Asp Val Pro Val Asp 
165 170 175 

AAG GCT GCG AAC TAC TCC TGC GAG AAT GCT GAC ATC ACT TAT AGG CTC 576
Lys Ala Ala Asn Tyr Ser Cys Glu Asn Ala Asp Ile Thr Tyr Arg Leu
180 185 190

TAC AAG ATA CTC AGC ATG AAG CTC CAT GAA GCG GAA CTT GAG AAC GTC 624
Tyr Lys Ile Leu Ser Met Lys Leu His Glu Ala Glu Leu Glu Asn Val
195 200 205

TTC TAC AGG ATA GAG ATG CCG CTT GTG AAC GTT CTT GCA CGC ATG GAA 672
Phe Tyr Arg Ile Glu Met Pro Leu Val Asn Val Leu Ala Arg Met Glu
210 215 220

TTG AAC GGG GTG TAT GTG GAC ACA GAA TTC CTG AAA AAG CTC TCG GAG 720
Leu Asn Gly Val Tyr Val Asp Thr Glu Phe Leu Lys Lys Leu Ser Glu
225 230 235 240

GAG TAC GGC AAA AAG CTC GAG GAA CTG GCC GAA AAA ATC TAC CAG ATA 768
Glu Tyr Gly Lys Lys Leu Glu Glu Leu Ala Glu Lys Ile Tyr Gln Ile
245 250 255

GCA GGA GAG CCC TTC AAC ATC AAT TCT CCA AAA CAG GTT TCA AAG ATC 816 
Ala Gly Glu Pro Phe Asn He Asn Ser Pro Lys Gin Val Ser Lys He 
260 265 270 

CTT TTT GAG AAG CTG GGA ATA AAA CCC CGT GGA AAA ACG ACA AAA ACA 864 
Leu Phe Glu Lys Leu Gly He Lys Pro Arg Gly Lys Thr Thr Lys Thr 
275 280 285 

GGA GCG TAC TCT ACC AGG ATA GAG GTG TTG GAA GAG ATA GCG AAT GAG 912
Gly Ala Tyr Ser Thr Arg Ile Glu Val Leu Glu Glu Ile Ala Asn Glu
290 295 300

CAC GAG ATA GTA CCC CTC ATT CTC GAG TAC AGA AAG ATC CAG AAA CTG 960
His Glu Ile Val Pro Leu Ile Leu Glu Tyr Arg Lys Ile Gln Lys Leu
305 310 315 320

AAA TCG ACC TAC ATA GAC ACC CTT CCG AAA CTT GTG AAC CCG AAA ACC 1008
Lys Ser Thr Tyr Ile Asp Thr Leu Pro Lys Leu Val Asn Pro Lys Thr
325 330 335

GGA AGA ATT CAT GCA TCT TTC CAC CAG ACG GGT ACC GCC ACT GGC AGG 1056 
Gly Arg He His Ala Ser Phe His Gin Thr Gly Thr Ala Thr Gly Arg 
340 345 350 

TTG AGT AGC AGT GAT CCA AAT CTT CAG AAT CTT CCG ACA AAG AGC GAA 1104 
Leu Ser Ser Ser Asp Pro Asn Leu Gin Asn Leu Pro Thr Lys Ser Glu 
355 360 365 

GAG GGA AAA GAA ATT AGA AAA GCG ATT GTG CCC CAG GAT CCA GAC TGG 1152 
Glu Gly Lys Glu He Arg Lys Ala He Val Pro Gin Asp Pro Asp Trp 
370 375 380 

TGG ATC GTC AGT GCG GAT TAT TCC CAA ATA GAA CTC AGA ATC CTC GCT 1200 
Trp He Val Ser Ala Asp Tyr Ser Gin He Glu Leu Arg He Leu Ala 
385 390 395 400 

CAT CTC AGT GGT GAT GAG AAC CTT GTG AAG GCC TTC GAG GAG GGC ATC 1248
His Leu Ser Gly Asp Glu Asn Leu Val Lys Ala Phe Glu Glu Gly Ile
405 410 415

GAT GTG CAC ACC TTG ACT GCC TCC AGG ATC TAC AAC GTA AAG CCA GAA 1296
Asp Val His Thr Leu Thr Ala Ser Arg Ile Tyr Asn Val Lys Pro Glu
420 425 430

GAA GTG AAC GAA GAA ATG CGA CGG GTT GGA AAG ATG GTG AAC TTC TCT 1344
Glu Val Asn Glu Glu Met Arg Arg Val Gly Lys Met Val Asn Phe Ser
435 440 445

ATA ATA TAC GGT GTC ACA CCG TAC GGT CTT TCT GTG AGA CTT GGA ATA 1392
Ile Ile Tyr Gly Val Thr Pro Tyr Gly Leu Ser Val Arg Leu Gly Ile
450 455 460

CCG GTT AAA GAA GCA GAA AAG ATG ATT ATC AGC TAT TTC ACA CTG TAT 1440
Pro Val Lys Glu Ala Glu Lys Met Ile Ile Ser Tyr Phe Thr Leu Tyr
465 470 475 480

CCA AAG GTG CGA AGC TAC ATC CAG CAG GTT GTT GCA GAG GCA AAA GAG 1488
Pro Lys Val Arg Ser Tyr Ile Gln Gln Val Val Ala Glu Ala Lys Glu
485 490 495

AAG GGC TAC GTC AGG ACT CTC TTT GGA AGA AAA AGA GAT ATT CCC CAG 1536 
Lys Gly Tyr Val Arg Thr Leu Phe Gly Arg Lys Arg Asp He Pro Gin 
500 505 510 

CTC ATG GCA AGG GAC AAG AAC ACC CAG TCC GAA GGC GAA AGA ATC GCA 1584 
Leu Met Ala Arg Asp Lys Asn Thr Gin Ser Glu Gly Glu Arg He Ala 
515 520 525 

ATA AAC ACC CCC ATT CAG GGA ACG GCG GCA GAT ATA ATA AAA TTG GCT 1632 
He Asn Thr Pro He Gin Gly Thr Ala Ala Asp He He Lys Leu Ala 
530 535 540 

ATG ATA GAT ATA GAC GAG GAG CTG AGA AAA AGA AAC ATG AAA TCC AGA 1680
Met Ile Asp Ile Asp Glu Glu Leu Arg Lys Arg Asn Met Lys Ser Arg
545 550 555 560

ATG ATC ATT CAG GTT CAT GAC GAA CTG GTC TTC GAG GTT CCC GAT GAG 1728 
Met He He Gin Val His Asp Glu Leu Val Phe Glu Val Pro Asp Glu 
565 570 575 

GAA AAA GAA GAA CTA GTT GAT CTG GTG AAG AAC AAA ATG ACA AAT GTG 1776 
Glu Lys Glu Glu Leu Val Asp Leu Val Lys Asn Lys Met Thr Asn Val 
580 585 590 

GTG AAA CTC TCT GTG CCT CTT GAG GTT GAC ATA AGC ATC GGA AAA AGC 1824 
Val Lys Leu Ser Val Pro Leu Glu Val Asp He Ser He Gly Lys Ser 
595 600 605 



TGG TCT TG 1833
Trp Ser
610

(2) INFORMATION FOR SEQ ID NO: 26:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 610 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26:

Met Lys Glu Leu Gln Leu Tyr Glu Glu Ala Glu Pro Thr Gly Tyr Glu
1 5 10 15

Ile Val Lys Asp His Lys Thr Phe Glu Asp Leu Ile Glu Lys Leu Lys
20 25 30

Glu Val Pro Ser Phe Ala Leu Asp Leu Glu Thr Ser Ser Leu Asp Pro
35 40 45

Phe Asn Cys Glu Ile Val Gly Ile Ser Val Ser Phe Lys Pro Lys Thr
50 55 60

Ala Tyr Tyr Ile Pro Leu His His Arg Asn Ala Gln Asn Leu Asp Glu
65 70 75 80

Thr Leu Val Leu Ser Lys Leu Lys Glu Ile Leu Glu Asp Pro Ser Ser
85 90 95

Lys lie Val Gly Gin Asn Leu Lys Tyr Asp Tyr Lys Val Leu Met Val 
100 105 110 

Lys Gly He Ser Pro Val Tyr Pro His Phe Asp Thr Met He Ala Ala 
115 120 125 

Tyr Leu Leu Glu Pro Asn Glu Lys Lys Phe Asn Leu Glu Asp Leu Ser 
130 135 140 

Leu Lys Phe Leu Gly Tyr Lys Met Thr Ser Tyr Gin Glu Leu Met Ser 
145 150 155 160 

Phe Ser Ser Pro Leu Phe Gly Phe Ser Phe Ala Asp Val Pro Val Asp 
165 170 175 

Lys Ala Ala Asn Tyr Ser Cys Glu Asn Ala Asp He Thr Tyr Arg Leu 
180 185 190 

Tyr Lys He Leu Ser Met Lys Leu His Glu Ala Glu Leu Glu Asn Val 
195 200 205 

Phe Tyr Arg He Glu Met Pro Leu Val Asn Val Leu Ala Arg Met Glu 
210 215 220 

Leu Asn Gly Val Tyr Val Asp Thr Glu Phe Leu Lys Lys Leu Ser Glu 
225 230 235 240 

Glu Tyr Gly Lys Lys Leu Glu Glu Leu Ala Glu Lys He Tyr Gin He 
245 250 255 

Ala Gly Glu Pro Phe Asn He Asn Ser Pro Lys Gin Val Ser Lys He 
260 265 270 

Leu Phe Glu Lys Leu Gly He Lys Pro Arg Gly Lys Thr Thr Lys Thr 
275 280 285 

Gly Ala Tyr Ser Thr Arg He Glu Val Leu Glu Glu He Ala Asn Glu 
290 295 300 

His Glu He Val Pro Leu He Leu Glu Tyr Arg Lys He Gin Lys Leu 
305 310 315 320 

Lys Ser Thr Tyr He Asp Thr Leu Pro Lys Leu Val Asn Pro Lys Thr 
325 330 335 

Gly Arg He His Ala Ser Phe His Gin Thr Gly Thr Ala Thr Gly Arg 
340 345 350 

Leu Ser Ser Ser Asp Pro Asn Leu Gin Asn Leu Pro Thr Lys Ser Glu 
355 360 365 

Glu Gly Lys Glu He Arg Lys Ala He Val Pro Gin Asp Pro Asp Trp 
370 375 380 

Trp He Val Ser Ala Asp Tyr Ser Gin He Glu Leu Arg He Leu Ala 
385 390 395 400 

His Leu Ser Gly Asp Glu Asn Leu Val Lys Ala Phe Glu Glu Gly He 
405 410 415 
Asp Val His Thr Leu Thr Ala Ser Arg Ile Tyr Asn Val Lys Pro Glu
420 425 430 

Glu Val Asn Glu Glu Met Arg Arg Val Gly Lys Met Val Asn Phe Ser 
435 440 445 

He He Tyr Gly Val Thr Pro Tyr Gly Leu Ser Val Arg Leu Gly He 
450 455 460 

Pro Val Lys Glu Ala Glu Lys Met He He Ser Tyr Phe Thr Leu Tyr 
465 470 475 480 

Pro Lys Val Arg Ser Tyr He Gin Gin Val Val Ala Glu Ala Lys Glu 
485 490 495 

Lys Gly Tyr Val Arg Thr Leu Phe Gly Arg Lys Arg Asp He Pro Gin 
500 505 510 

Leu Met Ala Arg Asp Lys Asn Thr Gin Ser Glu Gly Glu Arg He Ala 
515 520 525 

He Asn Thr Pro He Gin Gly Thr Ala Ala Asp He He Lys Leu Ala 
530 535 540 

Met He Asp He Asp Glu Glu Leu Arg Lys Arg Asn Met Lys Ser Arg 
545 550 555 560 

Met He He Gin Val His Asp Glu Leu Val Phe Glu Val Pro Asp Glu 
565 570 575 

Glu Lys Glu Glu Leu Val Asp Leu Val Lys Asn Lys Met Thr Asn Val 
580 585 590 

Val Lys Leu Ser Val Pro Leu Glu Val Asp He Ser He Gly Lys Ser 
595 600 605 

Trp Ser 
610 

(2) INFORMATION FOR SEQ ID NO: 27: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27: 
TTTGCCCTGG CCCTTGAAAC G 
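
An oligonucleotide such as SEQ ID NO: 27 can be compared base by base with the corresponding stretch of a coding sequence to see where it departs from the template. The Python sketch below is illustrative only: the function `mismatches` and the variable names are hypothetical, and the template is taken to be nucleotides 109 to 129 (codons 37 to 43) of SEQ ID NO: 22 as printed in this listing.

```python
# Illustrative sketch; names are hypothetical and not part of the listing.

def mismatches(template: str, primer: str):
    """Return (0-based index, template base, primer base) for each mismatch
    between two equal-length, already aligned sequences."""
    if len(template) != len(primer):
        raise ValueError("sequences must be the same length")
    return [
        (i, t, p)
        for i, (t, p) in enumerate(zip(template, primer))
        if t != p
    ]

template_start = 109  # 1-based coordinate of the first template base
# Codons 37-43 of SEQ ID NO: 22 as printed: Phe Ala Leu Asp Leu Glu Thr.
template = "".join("TTT GCC CTG GAC CTT GAA ACG".split())
# SEQ ID NO: 27 as printed (blocks of ten bases).
primer = "".join("TTTGCCCTGG CCCTTGAAAC G".split())

found = mismatches(template, primer)
for index, t_base, p_base in found:
    nucleotide = template_start + index
    codon = (nucleotide - 1) // 3 + 1
    print(nucleotide, codon, t_base, p_base)
```

On these inputs the only mismatch is an A to C change at nucleotide 119, the second base of codon 40, which turns GAC (Asp) into GCC (Ala).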
(2) INFORMATION FOR SEQ ID NO: 28: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1833 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..1833

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:28: 

ATG AAG GAA CTT CAA CTG TAC GAA GAA GCA GAA CCC ACC GGA TAC GAA 48
Met Lys Glu Leu Gln Leu Tyr Glu Glu Ala Glu Pro Thr Gly Tyr Glu
1 5 10 15

ATC GTG AAG GAT CAT AAG ACC TTC GAA GAT CTC ATC GAA AAG CTG AAG 96 
He Val Lys Asp His Lys Thr Phe Glu Asp Leu He Glu Lys Leu Lys 
20 25 30 

GAG GTT CCA TCT TTT GCC CTG GCC CTT GAA ACG TCC TCC CTT GAC CCG 144 
Glu Val Pro Ser Phe Ala Leu Ala Leu Glu Thr Ser Ser Leu Asp Pro 
35 40 45 

TTC AAC TGT GAG ATA GTC GGC ATC TCC GTG TCG TTC AAA CCG AAA ACA 192 
Phe Asn Cys Glu He Val Gly He Ser Val Ser Phe Lys Pro Lys Thr 
50 55 60 

GCT TAT TAC ATT CCA CTT CAT CAC AGA AAC GCC CAG AAT CTT GAT GAA 240
Ala Tyr Tyr Ile Pro Leu His His Arg Asn Ala Gln Asn Leu Asp Glu
65 70 75 80

ACA CTG GTG CTG TCG AAG TTG AAA GAG ATC CTC GAA GAC CCG TCT TCG 288
Thr Leu Val Leu Ser Lys Leu Lys Glu Ile Leu Glu Asp Pro Ser Ser
85 90 95

AAG ATT GTG GGT CAG AAC CTG AAG TAC GAC TAC AAG GTT CTT ATG GTA 336 
Lys He Val Gly Gin Asn Leu Lys Tyr Asp Tyr Lys Val Leu Met Val 
100 105 110 

AAG GGT ATA TCG CCA GTT TAT CCG CAT TTT GAC ACG ATG ATA GCT GCA 384 
Lys Gly He Ser Pro Val Tyr Pro His Phe Asp Thr Met He Ala Ala 
115 120 125 

TAT TTG CTG GAG CCA AAC GAG AAA AAA TTC AAT CTC GAA GAT CTG TCT 432 
Tyr Leu Leu Glu Pro Asn Glu Lys Lys Phe Asn Leu Glu Asp Leu Ser 
130 135 140 

TTG AAA TTT CTC GGA TAC AAA ATG ACG TCT TAT CAG GAA CTG ATG TCG 4 80 

Leu Lys Phe Leu Gly Tyr Lys Met Thr Ser Tyr Gin Glu Leu Met Ser 
145 150 155 160 

TTT TCC TCA CCA CTT TTT GGT TTC AGC TTT GCG GAT GTT CCG GTA GAC 528 
Phe Ser Ser Pro Leu Phe Gly Phe Ser Phe Ala Asp Val Pro Val Asp 
165 170 175 

AAG GCT GCG AAC TAC TCC TGC GAG GAT GCA GAC ATC ACT TAT AGG CTC 576 
Lys Ala Ala Asn Tyr Ser Cys Glu Asp Ala Asp He Thr Tyr Arg Leu 
180 185 190 

TAC AAG ATA CTC AGC ATG AAG CTC CAT GAA GCG GAA CTT GAG AAC GTC 624 
Tyr Lys He Leu Ser Met Lys Leu His Glu Ala Glu Leu Glu Asn Val 
195 200 205 

TTC TAC AGG ATA GAG ATG CCG CTT GTG AAC GTT CTT GCA CGC ATG GAA 6 72 

Phe Tyr Arg He Glu Met Pro Leu Val Asn Val Leu Ala Arg Met Glu 
210 215 220 
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TTG AAC GGG GTG TAT GTG GAC ACA GAA TTC CTG AAA AAG CTC TCG GAG 720 
Leu Asn Gly Val Tyr Val Asp Thr Glu Phe Leu Lys Lys Leu Ser Glu 
225 230 235 240 

GAG TAC GGC AAA AAG CTC GAG GAA CTG GCC GAA AAA ATC TAC CAG ATA 768
Glu Tyr Gly Lys Lys Leu Glu Glu Leu Ala Glu Lys Ile Tyr Gln Ile
245 250 255

GCA GGA GAG CCC TTC AAC ATC AAT TCT CCA AAA CAG GTT TCA AAG ATC 816
Ala Gly Glu Pro Phe Asn Ile Asn Ser Pro Lys Gln Val Ser Lys Ile
260 265 270

CTT TTT GAG AAG CTG GGA ATA AAA CCC CGT GGA AAA ACG ACA AAA ACA 864
Leu Phe Glu Lys Leu Gly Ile Lys Pro Arg Gly Lys Thr Thr Lys Thr
275 280 285

GGA GCG TAC TCT ACC AGG ATA GAG GTG TTG GAA GAG ATA GCG AAT GAG 912
Gly Ala Tyr Ser Thr Arg Ile Glu Val Leu Glu Glu Ile Ala Asn Glu
290 295 300

CAC GAG ATA GTA CCC CTC ATT CTC GAG TAC AGA AAG ATC CAG AAA CTG 960
His Glu Ile Val Pro Leu Ile Leu Glu Tyr Arg Lys Ile Gln Lys Leu
305 310 315 320

AAA TCG ACC TAC ATA GAC ACC CTT CCG AAA CTT GTG AAC CCG AAA ACC 1008
Lys Ser Thr Tyr Ile Asp Thr Leu Pro Lys Leu Val Asn Pro Lys Thr
325 330 335

GGA AGA ATT CAT GCA TCT TTC CAC CAG ACG GGT ACC GCC ACT GGC AGG 1056
Gly Arg Ile His Ala Ser Phe His Gln Thr Gly Thr Ala Thr Gly Arg
340 345 350

TTG AGT AGC AGT GAT CCA AAT CTT CAG AAT CTT CCG ACA AAG AGC GAA 1104
Leu Ser Ser Ser Asp Pro Asn Leu Gln Asn Leu Pro Thr Lys Ser Glu
355 360 365

GAG GGA AAA GAA ATT AGA AAA GCG ATT GTG CCC CAG GAT CCA GAC TGG 1152
Glu Gly Lys Glu Ile Arg Lys Ala Ile Val Pro Gln Asp Pro Asp Trp
370 375 380

TGG ATC GTC AGT GCG GAT TAT TCC CAA ATA GAA CTC AGA ATC CTC GCT 1200
Trp Ile Val Ser Ala Asp Tyr Ser Gln Ile Glu Leu Arg Ile Leu Ala
385 390 395 400

CAT CTC AGT GGT GAT GAG AAC CTT GTG AAG GCC TTC GAG GAG GGC ATC 1248
His Leu Ser Gly Asp Glu Asn Leu Val Lys Ala Phe Glu Glu Gly Ile
405 410 415

GAT GTG CAC ACC TTG ACT GCC TCC AGG ATC TAC AAC GTA AAG CCA GAA 1296
Asp Val His Thr Leu Thr Ala Ser Arg Ile Tyr Asn Val Lys Pro Glu
420 425 430

GAA GTG AAC GAA GAA ATG CGA CGG GTT GGA AAG ATG GTG AAC TTC TCT 1344
Glu Val Asn Glu Glu Met Arg Arg Val Gly Lys Met Val Asn Phe Ser
435 440 445

ATA ATA TAC GGT GTC ACA CCG TAC GGT CTT TCT GTG AGA CTT GGA ATA 1392
Ile Ile Tyr Gly Val Thr Pro Tyr Gly Leu Ser Val Arg Leu Gly Ile
450 455 460

CCG GTT AAA GAA GCA GAA AAG ATG ATT ATC AGC TAT TTC ACA CTG TAT 1440
Pro Val Lys Glu Ala Glu Lys Met Ile Ile Ser Tyr Phe Thr Leu Tyr
465 470 475 480

CCA AAG GTG CGA AGC TAC ATC CAG CAG GTT GTT GCA GAG GCA AAA GAG 1488
Pro Lys Val Arg Ser Tyr Ile Gln Gln Val Val Ala Glu Ala Lys Glu
485 490 495

AAG GGC TAC GTC AGG ACT CTC TTT GGA AGA AAA AGA GAT ATT CCC CAG 1536
Lys Gly Tyr Val Arg Thr Leu Phe Gly Arg Lys Arg Asp Ile Pro Gln
500 505 510

CTC ATG GCA AGG GAC AAG AAC ACC CAG TCC GAA GGC GAA AGA ATC GCA 1584
Leu Met Ala Arg Asp Lys Asn Thr Gln Ser Glu Gly Glu Arg Ile Ala
515 520 525

ATA AAC ACC CCC ATT CAG GGA ACG GCG GCA GAT ATA ATA AAA TTG GCT 1632
Ile Asn Thr Pro Ile Gln Gly Thr Ala Ala Asp Ile Ile Lys Leu Ala
530 535 540

ATG ATA GAT ATA GAC GAG GAG CTG AGA AAA AGA AAC ATG AAA TCC AGA 1680
Met Ile Asp Ile Asp Glu Glu Leu Arg Lys Arg Asn Met Lys Ser Arg
545 550 555 560

ATG ATC ATT CAG GTT CAT GAC GAA CTG GTC TTC GAG GTT CCC GAT GAG 1728
Met Ile Ile Gln Val His Asp Glu Leu Val Phe Glu Val Pro Asp Glu
565 570 575

GAA AAA GAA GAA CTA GTT GAT CTG GTG AAG AAC AAA ATG ACA AAT GTG 1776
Glu Lys Glu Glu Leu Val Asp Leu Val Lys Asn Lys Met Thr Asn Val
580 585 590

GTG AAA CTC TCT GTG CCT CTT GAG GTT GAC ATA AGC ATC GGA AAA AGC 1824
Val Lys Leu Ser Val Pro Leu Glu Val Asp Ile Ser Ile Gly Lys Ser
595 600 605

TGG TCT TG 1833
Trp Ser
610

(2) INFORMATION FOR SEQ ID NO: 29:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 610 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29: 

Met Lys Glu Leu Gln Leu Tyr Glu Glu Ala Glu Pro Thr Gly Tyr Glu
1 5 10 15

Ile Val Lys Asp His Lys Thr Phe Glu Asp Leu Ile Glu Lys Leu Lys
20 25 30

Glu Val Pro Ser Phe Ala Leu Ala Leu Glu Thr Ser Ser Leu Asp Pro
35 40 45

Phe Asn Cys Glu Ile Val Gly Ile Ser Val Ser Phe Lys Pro Lys Thr
50 55 60

Ala Tyr Tyr Ile Pro Leu His His Arg Asn Ala Gln Asn Leu Asp Glu
65 70 75 80

Thr Leu Val Leu Ser Lys Leu Lys Glu Ile Leu Glu Asp Pro Ser Ser
85 90 95

Lys Ile Val Gly Gln Asn Leu Lys Tyr Asp Tyr Lys Val Leu Met Val
100 105 110

Lys Gly Ile Ser Pro Val Tyr Pro His Phe Asp Thr Met Ile Ala Ala
115 120 125

Tyr Leu Leu Glu Pro Asn Glu Lys Lys Phe Asn Leu Glu Asp Leu Ser
130 135 140

Leu Lys Phe Leu Gly Tyr Lys Met Thr Ser Tyr Gln Glu Leu Met Ser
145 150 155 160

Phe Ser Ser Pro Leu Phe Gly Phe Ser Phe Ala Asp Val Pro Val Asp
165 170 175

Lys Ala Ala Asn Tyr Ser Cys Glu Asp Ala Asp Ile Thr Tyr Arg Leu
180 185 190

Tyr Lys Ile Leu Ser Met Lys Leu His Glu Ala Glu Leu Glu Asn Val
195 200 205

Phe Tyr Arg Ile Glu Met Pro Leu Val Asn Val Leu Ala Arg Met Glu
210 215 220

Leu Asn Gly Val Tyr Val Asp Thr Glu Phe Leu Lys Lys Leu Ser Glu
225 230 235 240

Glu Tyr Gly Lys Lys Leu Glu Glu Leu Ala Glu Lys Ile Tyr Gln Ile
245 250 255

Ala Gly Glu Pro Phe Asn Ile Asn Ser Pro Lys Gln Val Ser Lys Ile
260 265 270

Leu Phe Glu Lys Leu Gly Ile Lys Pro Arg Gly Lys Thr Thr Lys Thr
275 280 285

Gly Ala Tyr Ser Thr Arg Ile Glu Val Leu Glu Glu Ile Ala Asn Glu
290 295 300

His Glu Ile Val Pro Leu Ile Leu Glu Tyr Arg Lys Ile Gln Lys Leu
305 310 315 320

Lys Ser Thr Tyr Ile Asp Thr Leu Pro Lys Leu Val Asn Pro Lys Thr
325 330 335

Gly Arg Ile His Ala Ser Phe His Gln Thr Gly Thr Ala Thr Gly Arg
340 345 350

Leu Ser Ser Ser Asp Pro Asn Leu Gln Asn Leu Pro Thr Lys Ser Glu
355 360 365

Glu Gly Lys Glu Ile Arg Lys Ala Ile Val Pro Gln Asp Pro Asp Trp
370 375 380

Trp Ile Val Ser Ala Asp Tyr Ser Gln Ile Glu Leu Arg Ile Leu Ala
385 390 395 400

His Leu Ser Gly Asp Glu Asn Leu Val Lys Ala Phe Glu Glu Gly Ile
405 410 415

Asp Val His Thr Leu Thr Ala Ser Arg Ile Tyr Asn Val Lys Pro Glu
420 425 430

Glu Val Asn Glu Glu Met Arg Arg Val Gly Lys Met Val Asn Phe Ser
435 440 445

Ile Ile Tyr Gly Val Thr Pro Tyr Gly Leu Ser Val Arg Leu Gly Ile
450 455 460

Pro Val Lys Glu Ala Glu Lys Met Ile Ile Ser Tyr Phe Thr Leu Tyr
465 470 475 480

Pro Lys Val Arg Ser Tyr Ile Gln Gln Val Val Ala Glu Ala Lys Glu
485 490 495

Lys Gly Tyr Val Arg Thr Leu Phe Gly Arg Lys Arg Asp Ile Pro Gln
500 505 510

Leu Met Ala Arg Asp Lys Asn Thr Gln Ser Glu Gly Glu Arg Ile Ala
515 520 525

Ile Asn Thr Pro Ile Gln Gly Thr Ala Ala Asp Ile Ile Lys Leu Ala
530 535 540

Met Ile Asp Ile Asp Glu Glu Leu Arg Lys Arg Asn Met Lys Ser Arg
545 550 555 560

Met Ile Ile Gln Val His Asp Glu Leu Val Phe Glu Val Pro Asp Glu
565 570 575

Glu Lys Glu Glu Leu Val Asp Leu Val Lys Asn Lys Met Thr Asn Val
580 585 590

Val Lys Leu Ser Val Pro Leu Glu Val Asp Ile Ser Ile Gly Lys Ser
595 600 605

Trp Ser
610

(2) INFORMATION FOR SEQ ID NO: 30: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 29 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30:
CCTGAAGTAC GCGTACAAGG TTCTTATGG 
(2) INFORMATION FOR SEQ ID NO: 31: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 22 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31: 
TCACACAGGA AACAGCTATG AC 22 
(2) INFORMATION FOR SEQ ID NO: 32: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1833 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..1833 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:32: 

ATG AAG GAA CTT CAA CTG TAC GAA GAA GCA GAA CCC ACC GGA TAC GAA 48
Met Lys Glu Leu Gln Leu Tyr Glu Glu Ala Glu Pro Thr Gly Tyr Glu
1 5 10 15

ATC GTG AAG GAT CAT AAG ACC TTC GAA GAT CTC ATC GAA AAG CTG AAG 96
Ile Val Lys Asp His Lys Thr Phe Glu Asp Leu Ile Glu Lys Leu Lys
20 25 30

GAG GTT CCA TCT TTT GCC CTG GAC CTT GAA ACG TCC TCC CTT GAC CCG 144
Glu Val Pro Ser Phe Ala Leu Asp Leu Glu Thr Ser Ser Leu Asp Pro
35 40 45

TTC AAC TGT GAG ATA GTC GGC ATC TCC GTG TCG TTC AAA CCG AAA ACA 192
Phe Asn Cys Glu Ile Val Gly Ile Ser Val Ser Phe Lys Pro Lys Thr
50 55 60

GCT TAT TAC ATT CCA CTT CAT CAC AGA AAC GCC CAG AAT CTT GAT GAA 240
Ala Tyr Tyr Ile Pro Leu His His Arg Asn Ala Gln Asn Leu Asp Glu
65 70 75 80

ACA CTG GTG CTG TCG AAG TTG AAA GAG ATC CTC GAA GAC CCG TCT TCG 288
Thr Leu Val Leu Ser Lys Leu Lys Glu Ile Leu Glu Asp Pro Ser Ser
85 90 95

AAG ATT GTG GGT CAG AAC CTG AAG TAC GCG TAC AAG GTT CTT ATG GTA 336
Lys Ile Val Gly Gln Asn Leu Lys Tyr Ala Tyr Lys Val Leu Met Val
100 105 110

AAG GGT ATA TCG CCA GTT TAT CCG CAT TTT GAC ACG ATG ATA GCT GCA 384
Lys Gly Ile Ser Pro Val Tyr Pro His Phe Asp Thr Met Ile Ala Ala
115 120 125

TAT TTG CTG GAG CCA AAC GAG AAA AAA TTC AAT CTC GAA GAT CTG TCT 432
Tyr Leu Leu Glu Pro Asn Glu Lys Lys Phe Asn Leu Glu Asp Leu Ser
130 135 140

TTG AAA TTT CTC GGA TAC AAA ATG ACG TCT TAT CAG GAA CTG ATG TCG 480
Leu Lys Phe Leu Gly Tyr Lys Met Thr Ser Tyr Gln Glu Leu Met Ser
145 150 155 160

TTT TCC TCA CCA CTT TTT GGT TTC AGC TTT GCG GAT GTT CCG GTA GAC 528
Phe Ser Ser Pro Leu Phe Gly Phe Ser Phe Ala Asp Val Pro Val Asp
165 170 175

AAG GCT GCG AAC TAC TCC TGC GAG GAT GCA GAC ATC ACT TAT AGG CTC 576
Lys Ala Ala Asn Tyr Ser Cys Glu Asp Ala Asp Ile Thr Tyr Arg Leu
180 185 190

TAC AAG ATA CTC AGC ATG AAG CTC CAT GAA GCG GAA CTT GAG AAC GTC 624
Tyr Lys Ile Leu Ser Met Lys Leu His Glu Ala Glu Leu Glu Asn Val
195 200 205

TTC TAC AGG ATA GAG ATG CCG CTT GTG AAC GTT CTT GCA CGC ATG GAA 672
Phe Tyr Arg Ile Glu Met Pro Leu Val Asn Val Leu Ala Arg Met Glu
210 215 220

TTG AAC GGG GTG TAT GTG GAC ACA GAA TTC CTG AAA AAG CTC TCG GAG 720
Leu Asn Gly Val Tyr Val Asp Thr Glu Phe Leu Lys Lys Leu Ser Glu
225 230 235 240

GAG TAC GGC AAA AAG CTC GAG GAA CTG GCC GAA AAA ATC TAC CAG ATA 768
Glu Tyr Gly Lys Lys Leu Glu Glu Leu Ala Glu Lys Ile Tyr Gln Ile
245 250 255

GCA GGA GAG CCC TTC AAC ATC AAT TCT CCA AAA CAG GTT TCA AAG ATC 816
Ala Gly Glu Pro Phe Asn Ile Asn Ser Pro Lys Gln Val Ser Lys Ile
260 265 270

CTT TTT GAG AAG CTG GGA ATA AAA CCC CGT GGA AAA ACG ACA AAA ACA 864
Leu Phe Glu Lys Leu Gly Ile Lys Pro Arg Gly Lys Thr Thr Lys Thr
275 280 285

GGA GCG TAC TCT ACC AGG ATA GAG GTG TTG GAA GAG ATA GCG AAT GAG 912
Gly Ala Tyr Ser Thr Arg Ile Glu Val Leu Glu Glu Ile Ala Asn Glu
290 295 300

CAC GAG ATA GTA CCC CTC ATT CTC GAG TAC AGA AAG ATC CAG AAA CTG 960
His Glu Ile Val Pro Leu Ile Leu Glu Tyr Arg Lys Ile Gln Lys Leu
305 310 315 320

AAA TCG ACC TAC ATA GAC ACC CTT CCG AAA CTT GTG AAC CCG AAA ACC 1008
Lys Ser Thr Tyr Ile Asp Thr Leu Pro Lys Leu Val Asn Pro Lys Thr
325 330 335

GGA AGA ATT CAT GCA TCT TTC CAC CAG ACG GGT ACC GCC ACT GGC AGG 1056
Gly Arg Ile His Ala Ser Phe His Gln Thr Gly Thr Ala Thr Gly Arg
340 345 350

TTG AGT AGC AGT GAT CCA AAT CTT CAG AAT CTT CCG ACA AAG AGC GAA 1104
Leu Ser Ser Ser Asp Pro Asn Leu Gln Asn Leu Pro Thr Lys Ser Glu
355 360 365

GAG GGA AAA GAA ATT AGA AAA GCG ATT GTG CCC CAG GAT CCA GAC TGG 1152
Glu Gly Lys Glu Ile Arg Lys Ala Ile Val Pro Gln Asp Pro Asp Trp
370 375 380

TGG ATC GTC AGT GCG GAT TAT TCC CAA ATA GAA CTC AGA ATC CTC GCT 1200
Trp Ile Val Ser Ala Asp Tyr Ser Gln Ile Glu Leu Arg Ile Leu Ala
385 390 395 400

CAT CTC AGT GGT GAT GAG AAC CTT GTG AAG GCC TTC GAG GAG GGC ATC 1248
His Leu Ser Gly Asp Glu Asn Leu Val Lys Ala Phe Glu Glu Gly Ile
405 410 415

GAT GTG CAC ACC TTG ACT GCC TCC AGG ATC TAC AAC GTA AAG CCA GAA 1296
Asp Val His Thr Leu Thr Ala Ser Arg Ile Tyr Asn Val Lys Pro Glu
420 425 430

GAA GTG AAC GAA GAA ATG CGA CGG GTT GGA AAG ATG GTG AAC TTC TCT 1344
Glu Val Asn Glu Glu Met Arg Arg Val Gly Lys Met Val Asn Phe Ser
435 440 445

ATA ATA TAC GGT GTC ACA CCG TAC GGT CTT TCT GTG AGA CTT GGA ATA 1392
Ile Ile Tyr Gly Val Thr Pro Tyr Gly Leu Ser Val Arg Leu Gly Ile
450 455 460

CCG GTT AAA GAA GCA GAA AAG ATG ATT ATC AGC TAT TTC ACA CTG TAT 1440
Pro Val Lys Glu Ala Glu Lys Met Ile Ile Ser Tyr Phe Thr Leu Tyr
465 470 475 480

CCA AAG GTG CGA AGC TAC ATC CAG CAG GTT GTT GCA GAG GCA AAA GAG 1488
Pro Lys Val Arg Ser Tyr Ile Gln Gln Val Val Ala Glu Ala Lys Glu
485 490 495

AAG GGC TAC GTC AGG ACT CTC TTT GGA AGA AAA AGA GAT ATT CCC CAG 1536
Lys Gly Tyr Val Arg Thr Leu Phe Gly Arg Lys Arg Asp Ile Pro Gln
500 505 510

CTC ATG GCA AGG GAC AAG AAC ACC CAG TCC GAA GGC GAA AGA ATC GCA 1584
Leu Met Ala Arg Asp Lys Asn Thr Gln Ser Glu Gly Glu Arg Ile Ala
515 520 525

ATA AAC ACC CCC ATT CAG GGA ACG GCG GCA GAT ATA ATA AAA TTG GCT 1632
Ile Asn Thr Pro Ile Gln Gly Thr Ala Ala Asp Ile Ile Lys Leu Ala
530 535 540

ATG ATA GAT ATA GAC GAG GAG CTG AGA AAA AGA AAC ATG AAA TCC AGA 1680
Met Ile Asp Ile Asp Glu Glu Leu Arg Lys Arg Asn Met Lys Ser Arg
545 550 555 560

ATG ATC ATT CAG GTT CAT GAC GAA CTG GTC TTC GAG GTT CCC GAT GAG 1728
Met Ile Ile Gln Val His Asp Glu Leu Val Phe Glu Val Pro Asp Glu
565 570 575

GAA AAA GAA GAA CTA GTT GAT CTG GTG AAG AAC AAA ATG ACA AAT GTG 1776
Glu Lys Glu Glu Leu Val Asp Leu Val Lys Asn Lys Met Thr Asn Val
580 585 590

GTG AAA CTC TCT GTG CCT CTT GAG GTT GAC ATA AGC ATC GGA AAA AGC 1824
Val Lys Leu Ser Val Pro Leu Glu Val Asp Ile Ser Ile Gly Lys Ser
595 600 605

TGG TCT TG 1833
Trp Ser
610

(2) INFORMATION FOR SEQ ID NO: 33: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 610 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear






(ii) MOLECULE TYPE: protein 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:33: 
Met Lys Glu Leu Gln Leu Tyr Glu Glu Ala Glu Pro Thr Gly Tyr Glu
1 5 10 15

Ile Val Lys Asp His Lys Thr Phe Glu Asp Leu Ile Glu Lys Leu Lys
20 25 30

Glu Val Pro Ser Phe Ala Leu Asp Leu Glu Thr Ser Ser Leu Asp Pro
35 40 45

Phe Asn Cys Glu Ile Val Gly Ile Ser Val Ser Phe Lys Pro Lys Thr
50 55 60

Ala Tyr Tyr Ile Pro Leu His His Arg Asn Ala Gln Asn Leu Asp Glu
65 70 75 80

Thr Leu Val Leu Ser Lys Leu Lys Glu Ile Leu Glu Asp Pro Ser Ser
85 90 95

Lys Ile Val Gly Gln Asn Leu Lys Tyr Ala Tyr Lys Val Leu Met Val
100 105 110

Lys Gly Ile Ser Pro Val Tyr Pro His Phe Asp Thr Met Ile Ala Ala
115 120 125

Tyr Leu Leu Glu Pro Asn Glu Lys Lys Phe Asn Leu Glu Asp Leu Ser
130 135 140

Leu Lys Phe Leu Gly Tyr Lys Met Thr Ser Tyr Gln Glu Leu Met Ser
145 150 155 160

Phe Ser Ser Pro Leu Phe Gly Phe Ser Phe Ala Asp Val Pro Val Asp
165 170 175

Lys Ala Ala Asn Tyr Ser Cys Glu Asp Ala Asp Ile Thr Tyr Arg Leu
180 185 190

Tyr Lys Ile Leu Ser Met Lys Leu His Glu Ala Glu Leu Glu Asn Val
195 200 205

Phe Tyr Arg Ile Glu Met Pro Leu Val Asn Val Leu Ala Arg Met Glu
210 215 220

Leu Asn Gly Val Tyr Val Asp Thr Glu Phe Leu Lys Lys Leu Ser Glu
225 230 235 240

Glu Tyr Gly Lys Lys Leu Glu Glu Leu Ala Glu Lys Ile Tyr Gln Ile
245 250 255

Ala Gly Glu Pro Phe Asn Ile Asn Ser Pro Lys Gln Val Ser Lys Ile
260 265 270

Leu Phe Glu Lys Leu Gly Ile Lys Pro Arg Gly Lys Thr Thr Lys Thr
275 280 285

Gly Ala Tyr Ser Thr Arg Ile Glu Val Leu Glu Glu Ile Ala Asn Glu
290 295 300

His Glu Ile Val Pro Leu Ile Leu Glu Tyr Arg Lys Ile Gln Lys Leu
305 310 315 320

Lys Ser Thr Tyr Ile Asp Thr Leu Pro Lys Leu Val Asn Pro Lys Thr
325 330 335

Gly Arg Ile His Ala Ser Phe His Gln Thr Gly Thr Ala Thr Gly Arg
340 345 350

Leu Ser Ser Ser Asp Pro Asn Leu Gln Asn Leu Pro Thr Lys Ser Glu
355 360 365

Glu Gly Lys Glu Ile Arg Lys Ala Ile Val Pro Gln Asp Pro Asp Trp
370 375 380

Trp Ile Val Ser Ala Asp Tyr Ser Gln Ile Glu Leu Arg Ile Leu Ala
385 390 395 400

His Leu Ser Gly Asp Glu Asn Leu Val Lys Ala Phe Glu Glu Gly Ile
405 410 415

Asp Val His Thr Leu Thr Ala Ser Arg Ile Tyr Asn Val Lys Pro Glu
420 425 430

Glu Val Asn Glu Glu Met Arg Arg Val Gly Lys Met Val Asn Phe Ser
435 440 445

Ile Ile Tyr Gly Val Thr Pro Tyr Gly Leu Ser Val Arg Leu Gly Ile
450 455 460

Pro Val Lys Glu Ala Glu Lys Met Ile Ile Ser Tyr Phe Thr Leu Tyr
465 470 475 480

Pro Lys Val Arg Ser Tyr Ile Gln Gln Val Val Ala Glu Ala Lys Glu
485 490 495

Lys Gly Tyr Val Arg Thr Leu Phe Gly Arg Lys Arg Asp Ile Pro Gln
500 505 510

Leu Met Ala Arg Asp Lys Asn Thr Gln Ser Glu Gly Glu Arg Ile Ala
515 520 525

Ile Asn Thr Pro Ile Gln Gly Thr Ala Ala Asp Ile Ile Lys Leu Ala
530 535 540

Met Ile Asp Ile Asp Glu Glu Leu Arg Lys Arg Asn Met Lys Ser Arg
545 550 555 560

Met Ile Ile Gln Val His Asp Glu Leu Val Phe Glu Val Pro Asp Glu
565 570 575

Glu Lys Glu Glu Leu Val Asp Leu Val Lys Asn Lys Met Thr Asn Val
580 585 590

Val Lys Leu Ser Val Pro Leu Glu Val Asp Ile Ser Ile Gly Lys Ser
595 600 605

Trp Ser
610

(2) INFORMATION FOR SEQ ID NO: 34: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1833 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..1833 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34: 

ATG AAG GAA CTT CAA CTG TAC GAA GAA GCA GAA CCC ACC GGA TAC GAA 48
Met Lys Glu Leu Gln Leu Tyr Glu Glu Ala Glu Pro Thr Gly Tyr Glu
1 5 10 15

ATC GTG AAG GAT CAT AAG ACC TTC GAA GAT CTC ATC GAA AAG CTG AAG 96
Ile Val Lys Asp His Lys Thr Phe Glu Asp Leu Ile Glu Lys Leu Lys
20 25 30

GAG GTT CCA TCT TTT GCC CTG GCC CTT GAA ACG TCC TCC CTT GAC CCG 144
Glu Val Pro Ser Phe Ala Leu Ala Leu Glu Thr Ser Ser Leu Asp Pro
35 40 45

TTC AAC TGT GAG ATA GTC GGC ATC TCC GTG TCG TTC AAA CCG AAA ACA 192
Phe Asn Cys Glu Ile Val Gly Ile Ser Val Ser Phe Lys Pro Lys Thr
50 55 60

GCT TAT TAC ATT CCA CTT CAT CAC AGA AAC GCC CAG AAT CTT GAT GAA 240
Ala Tyr Tyr Ile Pro Leu His His Arg Asn Ala Gln Asn Leu Asp Glu
65 70 75 80

ACA CTG GTG CTG TCG AAG TTG AAA GAG ATC CTC GAA GAC CCG TCT TCG 288
Thr Leu Val Leu Ser Lys Leu Lys Glu Ile Leu Glu Asp Pro Ser Ser
85 90 95

AAG ATT GTG GGT CAG AAC CTG AAG TAC GCG TAC AAG GTT CTT ATG GTA 336
Lys Ile Val Gly Gln Asn Leu Lys Tyr Ala Tyr Lys Val Leu Met Val
100 105 110

AAG GGT ATA TCG CCA GTT TAT CCG CAT TTT GAC ACG ATG ATA GCT GCA 384
Lys Gly Ile Ser Pro Val Tyr Pro His Phe Asp Thr Met Ile Ala Ala
115 120 125

TAT TTG CTG GAG CCA AAC GAG AAA AAA TTC AAT CTC GAA GAT CTG TCT 432
Tyr Leu Leu Glu Pro Asn Glu Lys Lys Phe Asn Leu Glu Asp Leu Ser
130 135 140

TTG AAA TTT CTC GGA TAC AAA ATG ACG TCT TAT CAG GAA CTG ATG TCG 480
Leu Lys Phe Leu Gly Tyr Lys Met Thr Ser Tyr Gln Glu Leu Met Ser
145 150 155 160

TTT TCC TCA CCA CTT TTT GGT TTC AGC TTT GCG GAT GTT CCG GTA GAC 528
Phe Ser Ser Pro Leu Phe Gly Phe Ser Phe Ala Asp Val Pro Val Asp
165 170 175

AAG GCT GCG AAC TAC TCC TGC GAG GAT GCA GAC ATC ACT TAT AGG CTC 576
Lys Ala Ala Asn Tyr Ser Cys Glu Asp Ala Asp Ile Thr Tyr Arg Leu
180 185 190

TAC AAG ATA CTC AGC ATG AAG CTC CAT GAA GCG GAA CTT GAG AAC GTC 624
Tyr Lys Ile Leu Ser Met Lys Leu His Glu Ala Glu Leu Glu Asn Val
195 200 205

TTC TAC AGG ATA GAG ATG CCG CTT GTG AAC GTT CTT GCA CGC ATG GAA 672
Phe Tyr Arg Ile Glu Met Pro Leu Val Asn Val Leu Ala Arg Met Glu
210 215 220

TTG AAC GGG GTG TAT GTG GAC ACA GAA TTC CTG AAA AAG CTC TCG GAG 720
Leu Asn Gly Val Tyr Val Asp Thr Glu Phe Leu Lys Lys Leu Ser Glu
225 230 235 240

GAG TAC GGC AAA AAG CTC GAG GAA CTG GCC GAA AAA ATC TAC CAG ATA 768
Glu Tyr Gly Lys Lys Leu Glu Glu Leu Ala Glu Lys Ile Tyr Gln Ile
245 250 255

GCA GGA GAG CCC TTC AAC ATC AAT TCT CCA AAA CAG GTT TCA AAG ATC 816
Ala Gly Glu Pro Phe Asn Ile Asn Ser Pro Lys Gln Val Ser Lys Ile
260 265 270

CTT TTT GAG AAG CTG GGA ATA AAA CCC CGT GGA AAA ACG ACA AAA ACA 864
Leu Phe Glu Lys Leu Gly Ile Lys Pro Arg Gly Lys Thr Thr Lys Thr
275 280 285

GGA GCG TAC TCT ACC AGG ATA GAG GTG TTG GAA GAG ATA GCG AAT GAG 912
Gly Ala Tyr Ser Thr Arg Ile Glu Val Leu Glu Glu Ile Ala Asn Glu
290 295 300

CAC GAG ATA GTA CCC CTC ATT CTC GAG TAC AGA AAG ATC CAG AAA CTG 960
His Glu Ile Val Pro Leu Ile Leu Glu Tyr Arg Lys Ile Gln Lys Leu
305 310 315 320

AAA TCG ACC TAC ATA GAC ACC CTT CCG AAA CTT GTG AAC CCG AAA ACC 1008
Lys Ser Thr Tyr Ile Asp Thr Leu Pro Lys Leu Val Asn Pro Lys Thr
325 330 335

GGA AGA ATT CAT GCA TCT TTC CAC CAG ACG GGT ACC GCC ACT GGC AGG 1056
Gly Arg Ile His Ala Ser Phe His Gln Thr Gly Thr Ala Thr Gly Arg
340 345 350

TTG AGT AGC AGT GAT CCA AAT CTT CAG AAT CTT CCG ACA AAG AGC GAA 1104
Leu Ser Ser Ser Asp Pro Asn Leu Gln Asn Leu Pro Thr Lys Ser Glu
355 360 365

GAG GGA AAA GAA ATT AGA AAA GCG ATT GTG CCC CAG GAT CCA GAC TGG 1152
Glu Gly Lys Glu Ile Arg Lys Ala Ile Val Pro Gln Asp Pro Asp Trp
370 375 380

TGG ATC GTC AGT GCG GAT TAT TCC CAA ATA GAA CTC AGA ATC CTC GCT 1200
Trp Ile Val Ser Ala Asp Tyr Ser Gln Ile Glu Leu Arg Ile Leu Ala
385 390 395 400

CAT CTC AGT GGT GAT GAG AAC CTT GTG AAG GCC TTC GAG GAG GGC ATC 1248
His Leu Ser Gly Asp Glu Asn Leu Val Lys Ala Phe Glu Glu Gly Ile
405 410 415

GAT GTG CAC ACC TTG ACT GCC TCC AGG ATC TAC AAC GTA AAG CCA GAA 1296
Asp Val His Thr Leu Thr Ala Ser Arg Ile Tyr Asn Val Lys Pro Glu
420 425 430

GAA GTG AAC GAA GAA ATG CGA CGG GTT GGA AAG ATG GTG AAC TTC TCT 1344
Glu Val Asn Glu Glu Met Arg Arg Val Gly Lys Met Val Asn Phe Ser
435 440 445

ATA ATA TAC GGT GTC ACA CCG TAC GGT CTT TCT GTG AGA CTT GGA ATA 1392
Ile Ile Tyr Gly Val Thr Pro Tyr Gly Leu Ser Val Arg Leu Gly Ile
450 455 460

CCG GTT AAA GAA GCA GAA AAG ATG ATT ATC AGC TAT TTC ACA CTG TAT 1440
Pro Val Lys Glu Ala Glu Lys Met Ile Ile Ser Tyr Phe Thr Leu Tyr
465 470 475 480

CCA AAG GTG CGA AGC TAC ATC CAG CAG GTT GTT GCA GAG GCA AAA GAG 1488
Pro Lys Val Arg Ser Tyr Ile Gln Gln Val Val Ala Glu Ala Lys Glu
485 490 495

AAG GGC TAC GTC AGG ACT CTC TTT GGA AGA AAA AGA GAT ATT CCC CAG 1536
Lys Gly Tyr Val Arg Thr Leu Phe Gly Arg Lys Arg Asp Ile Pro Gln
500 505 510

CTC ATG GCA AGG GAC AAG AAC ACC CAG TCC GAA GGC GAA AGA ATC GCA 1584
Leu Met Ala Arg Asp Lys Asn Thr Gln Ser Glu Gly Glu Arg Ile Ala
515 520 525

ATA AAC ACC CCC ATT CAG GGA ACG GCG GCA GAT ATA ATA AAA TTG GCT 1632
Ile Asn Thr Pro Ile Gln Gly Thr Ala Ala Asp Ile Ile Lys Leu Ala
530 535 540

ATG ATA GAT ATA GAC GAG GAG CTG AGA AAA AGA AAC ATG AAA TCC AGA 1680
Met Ile Asp Ile Asp Glu Glu Leu Arg Lys Arg Asn Met Lys Ser Arg
545 550 555 560

ATG ATC ATT CAG GTT CAT GAC GAA CTG GTC TTC GAG GTT CCC GAT GAG 1728
Met Ile Ile Gln Val His Asp Glu Leu Val Phe Glu Val Pro Asp Glu
565 570 575

GAA AAA GAA GAA CTA GTT GAT CTG GTG AAG AAC AAA ATG ACA AAT GTG 1776
Glu Lys Glu Glu Leu Val Asp Leu Val Lys Asn Lys Met Thr Asn Val
580 585 590

GTG AAA CTC TCT GTG CCT CTT GAG GTT GAC ATA AGC ATC GGA AAA AGC 1824
Val Lys Leu Ser Val Pro Leu Glu Val Asp Ile Ser Ile Gly Lys Ser
595 600 605

TGG TCT TG 1833
Trp Ser
610

(2) INFORMATION FOR SEQ ID NO: 35: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 610 amino acids 

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35:

Met Lys Glu Leu Gln Leu Tyr Glu Glu Ala Glu Pro Thr Gly Tyr Glu
1 5 10 15

Ile Val Lys Asp His Lys Thr Phe Glu Asp Leu Ile Glu Lys Leu Lys
20 25 30

Glu Val Pro Ser Phe Ala Leu Ala Leu Glu Thr Ser Ser Leu Asp Pro
35 40 45

Phe Asn Cys Glu Ile Val Gly Ile Ser Val Ser Phe Lys Pro Lys Thr
50 55 60

Ala Tyr Tyr Ile Pro Leu His His Arg Asn Ala Gln Asn Leu Asp Glu
65 70 75 80

Thr Leu Val Leu Ser Lys Leu Lys Glu Ile Leu Glu Asp Pro Ser Ser
85 90 95

Lys Ile Val Gly Gln Asn Leu Lys Tyr Ala Tyr Lys Val Leu Met Val
100 105 110

Lys Gly Ile Ser Pro Val Tyr Pro His Phe Asp Thr Met Ile Ala Ala
115 120 125

Tyr Leu Leu Glu Pro Asn Glu Lys Lys Phe Asn Leu Glu Asp Leu Ser
130 135 140

Leu Lys Phe Leu Gly Tyr Lys Met Thr Ser Tyr Gln Glu Leu Met Ser
145 150 155 160

Phe Ser Ser Pro Leu Phe Gly Phe Ser Phe Ala Asp Val Pro Val Asp
165 170 175

Lys Ala Ala Asn Tyr Ser Cys Glu Asp Ala Asp Ile Thr Tyr Arg Leu
180 185 190

Tyr Lys Ile Leu Ser Met Lys Leu His Glu Ala Glu Leu Glu Asn Val
195 200 205

Phe Tyr Arg Ile Glu Met Pro Leu Val Asn Val Leu Ala Arg Met Glu
210 215 220

Leu Asn Gly Val Tyr Val Asp Thr Glu Phe Leu Lys Lys Leu Ser Glu
225 230 235 240

Glu Tyr Gly Lys Lys Leu Glu Glu Leu Ala Glu Lys Ile Tyr Gln Ile
245 250 255

Ala Gly Glu Pro Phe Asn Ile Asn Ser Pro Lys Gln Val Ser Lys Ile
260 265 270

Leu Phe Glu Lys Leu Gly Ile Lys Pro Arg Gly Lys Thr Thr Lys Thr
275 280 285

Gly Ala Tyr Ser Thr Arg Ile Glu Val Leu Glu Glu Ile Ala Asn Glu
290 295 300

His Glu Ile Val Pro Leu Ile Leu Glu Tyr Arg Lys Ile Gln Lys Leu
305 310 315 320

Lys Ser Thr Tyr Ile Asp Thr Leu Pro Lys Leu Val Asn Pro Lys Thr
325 330 335

Gly Arg Ile His Ala Ser Phe His Gln Thr Gly Thr Ala Thr Gly Arg
340 345 350

Leu Ser Ser Ser Asp Pro Asn Leu Gln Asn Leu Pro Thr Lys Ser Glu
355 360 365

Glu Gly Lys Glu Ile Arg Lys Ala Ile Val Pro Gln Asp Pro Asp Trp
370 375 380

Trp Ile Val Ser Ala Asp Tyr Ser Gln Ile Glu Leu Arg Ile Leu Ala
385 390 395 400

His Leu Ser Gly Asp Glu Asn Leu Val Lys Ala Phe Glu Glu Gly Ile
405 410 415

Asp Val His Thr Leu Thr Ala Ser Arg Ile Tyr Asn Val Lys Pro Glu
420 425 430

Glu Val Asn Glu Glu Met Arg Arg Val Gly Lys Met Val Asn Phe Ser
435 440 445

Ile Ile Tyr Gly Val Thr Pro Tyr Gly Leu Ser Val Arg Leu Gly Ile
450 455 460

Pro Val Lys Glu Ala Glu Lys Met Ile Ile Ser Tyr Phe Thr Leu Tyr
465 470 475 480

Pro Lys Val Arg Ser Tyr Ile Gln Gln Val Val Ala Glu Ala Lys Glu
485 490 495

Lys Gly Tyr Val Arg Thr Leu Phe Gly Arg Lys Arg Asp Ile Pro Gln
500 505 510

Leu Met Ala Arg Asp Lys Asn Thr Gln Ser Glu Gly Glu Arg Ile Ala
515 520 525

Ile Asn Thr Pro Ile Gln Gly Thr Ala Ala Asp Ile Ile Lys Leu Ala
530 535 540

Met Ile Asp Ile Asp Glu Glu Leu Arg Lys Arg Asn Met Lys Ser Arg
545 550 555 560

Met Ile Ile Gln Val His Asp Glu Leu Val Phe Glu Val Pro Asp Glu
565 570 575

Glu Lys Glu Glu Leu Val Asp Leu Val Lys Asn Lys Met Thr Asn Val
580 585 590

Val Lys Leu Ser Val Pro Leu Glu Val Asp Ile Ser Ile Gly Lys Ser
595 600 605

Trp Ser
610

(2) INFORMATION FOR SEQ ID NO: 36: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1716 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..1716

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:36:



ATG CTT GAA ACG TCC TCC CTT GAC CCG TTC AAC TGT GAG ATA GTC GGC 48
Met Leu Glu Thr Ser Ser Leu Asp Pro Phe Asn Cys Glu Ile Val Gly
1 5 10 15

ATC TCC GTG TCG TTC AAA CCG AAA ACA GCT TAT TAC ATT CCA CTT CAT 96
Ile Ser Val Ser Phe Lys Pro Lys Thr Ala Tyr Tyr Ile Pro Leu His
20 25 30

CAC AGA AAC GCC CAG AAT CTT GAT GAA ACA CTG GTG CTG TCG AAG TTG 144
His Arg Asn Ala Gln Asn Leu Asp Glu Thr Leu Val Leu Ser Lys Leu
35 40 45

AAA GAG ATC CTC GAA GAC CCG TCT TCG AAG ATT GTG GGT CAG AAC CTG 192
Lys Glu Ile Leu Glu Asp Pro Ser Ser Lys Ile Val Gly Gln Asn Leu
50 55 60

AAG TAC GAC TAC AAG GTT CTT ATG GTA AAG GGT ATA TCG CCA GTT TAT 240
Lys Tyr Asp Tyr Lys Val Leu Met Val Lys Gly Ile Ser Pro Val Tyr
65 70 75 80

CCG CAT TTT GAC ACG ATG ATA GCT GCA TAT TTG CTG GAG CCA AAC GAG 288
Pro His Phe Asp Thr Met Ile Ala Ala Tyr Leu Leu Glu Pro Asn Glu
85 90 95

AAA AAA TTC AAT CTC GAA GAT CTG TCT TTG AAA TTT CTC GGA TAC AAA 336
Lys Lys Phe Asn Leu Glu Asp Leu Ser Leu Lys Phe Leu Gly Tyr Lys
100 105 110

ATG ACG TCT TAT CAG GAA CTG ATG TCG TTT TCC TCA CCA CTT TTT GGT 384
Met Thr Ser Tyr Gln Glu Leu Met Ser Phe Ser Ser Pro Leu Phe Gly
115 120 125

TTC AGC TTT GCG GAT GTT CCG GTA GAC AAG GCT GCG AAC TAC TCC TGC 432
Phe Ser Phe Ala Asp Val Pro Val Asp Lys Ala Ala Asn Tyr Ser Cys
130 135 140

GAG GAT GCA GAC ATC ACT TAT AGG CTC TAC AAG ATA CTC AGC ATG AAG 480
Glu Asp Ala Asp Ile Thr Tyr Arg Leu Tyr Lys Ile Leu Ser Met Lys
145 150 155 160

CTC CAT GAA GCG GAA CTT GAG AAC GTC TTC TAC AGG ATA GAG ATG CCG 528
Leu His Glu Ala Glu Leu Glu Asn Val Phe Tyr Arg Ile Glu Met Pro
165 170 175

CTT GTG AAC GTT CTT GCA CGC ATG GAA TTG AAC GGG GTG TAT GTG GAC 576
Leu Val Asn Val Leu Ala Arg Met Glu Leu Asn Gly Val Tyr Val Asp
180 185 190

ACA GAA TTC CTG AAA AAG CTC TCG GAG GAG TAC GGC AAA AAG CTC GAG 624
Thr Glu Phe Leu Lys Lys Leu Ser Glu Glu Tyr Gly Lys Lys Leu Glu
195 200 205

GAA CTG GCC GAA AAA ATC TAC CAG ATA GCA GGA GAG CCC TTC AAC ATC 672
Glu Leu Ala Glu Lys Ile Tyr Gln Ile Ala Gly Glu Pro Phe Asn Ile
210 215 220

AAT TCT CCA AAA CAG GTT TCA AAG ATC CTT TTT GAG AAG CTG GGA ATA 720 
Asn Ser Pro Lys Gin Val Ser Lys He Leu Phe Glu Lys Leu Gly He 
225 230 235 240 
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AAA CCC CGT GGA AAA ACG ACA AAA ACA GGA GCG TAC TCT ACC AGG ATA 76 8 

Lys Pro Arg Gly Lys Thr Thr Lys Thr Gly Ala Tyr Ser Thr Arg He 
245 250 255 

GAG GTG TTG GAA GAG ATA GCG AAT GAG CAC GAG ATA GTA CCC CTC ATT 816 
Glu Val Leu Glu Glu He Ala Asn Glu His Glu He Val Pro Leu He 
260 265 270 

CTC GAG TAC AGA AAG ATC GAG AAA CTG AAA TCG ACC TAC ATA GAC ACC 864 
Leu Glu Tyr Arg Lys lie Gin Lys Leu Lys Ser Thr Tyr He Asp Thr 
275 280 285 

CTT CCG AAA CTT GTG AAC CCG AAA ACC GGA AGA ATT CAT GCA TCT TTC 912 
Leu Pro Lys Leu Val Asn Pro Lys Thr Gly Arg He His Ala Ser Phe 
290 295 300 

CAC CAG ACG GGT ACC GCC ACT GGC AGG TTG AGT AGC AGT GAT CCA AAT 96 0 

His Gin Thr Gly Thr Ala Thr Gly Arg Leu Ser Ser Ser Asp Pro Asn 
305 310 315 320 

CTT CAG AAT CTT CCG ACA AAG AGC GAA GAG GGA AAA GAA ATT AGA AAA 1008 
Leu Gin Asn Leu Pro Thr Lys Ser Glu Glu Gly Lys Glu He Arg Lys 

325 330 335 

GCG ATT GTG CCC CAG GAT CCA GAC TGG TGG ATC GTC AGT GCG GAT TAT 1056 
Ala He Val Pro Gin Asp Pro Asp Trp Trp He Val Ser Ala Asp Tyr 
340 345 350 

TCC CAA ATA GAA CTC AGA ATC CTC GCT CAT CTC AGT GGT GAT GAG AAC 1104 
Ser Gin He Glu Leu Arg lie Leu Ala His Leu Ser Gly Asp Glu Asn 
355 360 365 

CTT GTG AAG GCC TTC GAG GAG GGC ATC GAT GTG CAC ACC TTG ACT GCC 1152 
Leu Val Lys Ala Phe Glu Glu Gly He Asp Val His Thr Leu Thr Ala 
370 375 380 

TCC AGG ATC TAC AAC GTA AAG CCA GAA GAA GTG AAC GAA GAA ATG CGA 1200 
Ser Arg lie Tyr Asn Val Lys Pro Glu Glu Val Asn Glu Glu Met Arg 
385 390 395 400 

CGG GTT GGA AAG ATG GTG AAC TTC TCT ATA ATA TAC GGT GTC ACA CCG 1248 
Arg Val Gly Lys Met Val Asn Phe Ser He lie Tyr Gly Val Thr Pro 
405 410 415 

TAC GGT CTT TCT GTG AGA CTT GGA ATA CCG GTT AAA GAA GCA GAA AAG 1296
Tyr Gly Leu Ser Val Arg Leu Gly Ile Pro Val Lys Glu Ala Glu Lys
420 425 430 

ATG ATT ATC AGC TAT TTC ACA CTG TAT CCA AAG GTG CGA AGC TAC ATC 1344 
Met Ile Ile Ser Tyr Phe Thr Leu Tyr Pro Lys Val Arg Ser Tyr Ile
435 440 445 

CAG CAG GTT GTT GCA GAG GCA AAA GAG AAG GGC TAC GTC AGG ACT CTC 1392
Gln Gln Val Val Ala Glu Ala Lys Glu Lys Gly Tyr Val Arg Thr Leu
450 455 460 

TTT GGA AGA AAA AGA GAT ATT CCC CAG CTC ATG GCA AGG GAC AAG AAC 1440
Phe Gly Arg Lys Arg Asp Ile Pro Gln Leu Met Ala Arg Asp Lys Asn
465 470 475 480 

ACC CAG TCC GAA GGC GAA AGA ATC GCA ATA AAC ACC CCC ATT CAG GGA 1488
Thr Gln Ser Glu Gly Glu Arg Ile Ala Ile Asn Thr Pro Ile Gln Gly
485 490 495

ACG GCG GCA GAT ATA ATA AAA TTG GCT ATG ATA GAT ATA GAC GAG GAG 1536 
Thr Ala Ala Asp Ile Ile Lys Leu Ala Met Ile Asp Ile Asp Glu Glu
500 505 510 

CTG AGA AAA AGA AAC ATG AAA TCC AGA ATG ATC ATT CAG GTT CAT GAC 1584 
Leu Arg Lys Arg Asn Met Lys Ser Arg Met Ile Ile Gln Val His Asp
515 520 525 

GAA CTG GTC TTC GAG GTT CCC GAT GAG GAA AAA GAA GAA CTA GTT GAT 1632
Glu Leu Val Phe Glu Val Pro Asp Glu Glu Lys Glu Glu Leu Val Asp
530 535 540 

CTG GTG AAG AAC AAA ATG ACA AAT GTG GTG AAA CTC TCT GTG CCT CTT 1680
Leu Val Lys Asn Lys Met Thr Asn Val Val Lys Leu Ser Val Pro Leu
545 550 555 560 

GAG GTT GAC ATA AGC ATC GGA AAA AGC TGG TCT TG 1716 
Glu Val Asp Ile Ser Ile Gly Lys Ser Trp Ser
565 570 

(2) INFORMATION FOR SEQ ID NO: 37: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 571 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:37: 

Met Leu Glu Thr Ser Ser Leu Asp Pro Phe Asn Cys Glu Ile Val Gly
1 5 10 15

Ile Ser Val Ser Phe Lys Pro Lys Thr Ala Tyr Tyr Ile Pro Leu His
20 25 30

His Arg Asn Ala Gln Asn Leu Asp Glu Thr Leu Val Leu Ser Lys Leu
35 40 45

Lys Glu Ile Leu Glu Asp Pro Ser Ser Lys Ile Val Gly Gln Asn Leu
50 55 60

Lys Tyr Asp Tyr Lys Val Leu Met Val Lys Gly Ile Ser Pro Val Tyr
65 70 75 80

Pro His Phe Asp Thr Met Ile Ala Ala Tyr Leu Leu Glu Pro Asn Glu
85 90 95

Lys Lys Phe Asn Leu Glu Asp Leu Ser Leu Lys Phe Leu Gly Tyr Lys
100 105 110

Met Thr Ser Tyr Gln Glu Leu Met Ser Phe Ser Ser Pro Leu Phe Gly
115 120 125

Phe Ser Phe Ala Asp Val Pro Val Asp Lys Ala Ala Asn Tyr Ser Cys
130 135 140

Glu Asp Ala Asp Ile Thr Tyr Arg Leu Tyr Lys Ile Leu Ser Met Lys
145 150 155 160

Leu His Glu Ala Glu Leu Glu Asn Val Phe Tyr Arg Ile Glu Met Pro
165 170 175

Leu Val Asn Val Leu Ala Arg Met Glu Leu Asn Gly Val Tyr Val Asp
180 185 190

Thr Glu Phe Leu Lys Lys Leu Ser Glu Glu Tyr Gly Lys Lys Leu Glu
195 200 205

Glu Leu Ala Glu Lys Ile Tyr Gln Ile Ala Gly Glu Pro Phe Asn Ile
210 215 220

Asn Ser Pro Lys Gln Val Ser Lys Ile Leu Phe Glu Lys Leu Gly Ile
225 230 235 240

Lys Pro Arg Gly Lys Thr Thr Lys Thr Gly Ala Tyr Ser Thr Arg Ile
245 250 255

Glu Val Leu Glu Glu Ile Ala Asn Glu His Glu Ile Val Pro Leu Ile
260 265 270

Leu Glu Tyr Arg Lys Ile Gln Lys Leu Lys Ser Thr Tyr Ile Asp Thr
275 280 285

Leu Pro Lys Leu Val Asn Pro Lys Thr Gly Arg Ile His Ala Ser Phe
290 295 300

His Gln Thr Gly Thr Ala Thr Gly Arg Leu Ser Ser Ser Asp Pro Asn
305 310 315 320

Leu Gln Asn Leu Pro Thr Lys Ser Glu Glu Gly Lys Glu Ile Arg Lys
325 330 335

Ala Ile Val Pro Gln Asp Pro Asp Trp Trp Ile Val Ser Ala Asp Tyr
340 345 350

Ser Gln Ile Glu Leu Arg Ile Leu Ala His Leu Ser Gly Asp Glu Asn
355 360 365

Leu Val Lys Ala Phe Glu Glu Gly Ile Asp Val His Thr Leu Thr Ala
370 375 380

Ser Arg Ile Tyr Asn Val Lys Pro Glu Glu Val Asn Glu Glu Met Arg
385 390 395 400

Arg Val Gly Lys Met Val Asn Phe Ser Ile Ile Tyr Gly Val Thr Pro
405 410 415

Tyr Gly Leu Ser Val Arg Leu Gly Ile Pro Val Lys Glu Ala Glu Lys
420 425 430

Met Ile Ile Ser Tyr Phe Thr Leu Tyr Pro Lys Val Arg Ser Tyr Ile
435 440 445

Gln Gln Val Val Ala Glu Ala Lys Glu Lys Gly Tyr Val Arg Thr Leu
450 455 460

Phe Gly Arg Lys Arg Asp Ile Pro Gln Leu Met Ala Arg Asp Lys Asn
465 470 475 480

Thr Gln Ser Glu Gly Glu Arg Ile Ala Ile Asn Thr Pro Ile Gln Gly
485 490 495

Thr Ala Ala Asp Ile Ile Lys Leu Ala Met Ile Asp Ile Asp Glu Glu
500 505 510

Leu Arg Lys Arg Asn Met Lys Ser Arg Met Ile Ile Gln Val His Asp
515 520 525

Glu Leu Val Phe Glu Val Pro Asp Glu Glu Lys Glu Glu Leu Val Asp
530 535 540

Leu Val Lys Asn Lys Met Thr Asn Val Val Lys Leu Ser Val Pro Leu
545 550 555 560

Glu Val Asp Ile Ser Ile Gly Lys Ser Trp Ser
565 570 

(2) INFORMATION FOR SEQ ID NO: 38: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1485 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..1485

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38:

ATG AAG GAA CTT CAA CTG TAC GAA GAA GCA GAA CCC ACC GGA TAC GAA 48
Met Lys Glu Leu Gln Leu Tyr Glu Glu Ala Glu Pro Thr Gly Tyr Glu
1 5 10 15

ATC GTG AAG GAT CAT AAG ACC TTC GAA GAT CTG TCT TTG AAA TTT CTC 96 
Ile Val Lys Asp His Lys Thr Phe Glu Asp Leu Ser Leu Lys Phe Leu
20 25 30 

GGA TAC AAA ATG ACG TCT TAT CAG GAA CTG ATG TCG TTT TCC TCA CCA 144
Gly Tyr Lys Met Thr Ser Tyr Gln Glu Leu Met Ser Phe Ser Ser Pro
35 40 45 

CTT TTT GGT TTC AGC TTT GCG GAT GTT CCG GTA GAC AAG GCT GCG AAC 192 
Leu Phe Gly Phe Ser Phe Ala Asp Val Pro Val Asp Lys Ala Ala Asn 
50 55 60

TAC TCC TGC GAG GAT GCA GAC ATC ACT TAT AGG CTC TAC AAG ATA CTC 240
Tyr Ser Cys Glu Asp Ala Asp Ile Thr Tyr Arg Leu Tyr Lys Ile Leu
65 70 75 80 

AGC ATG AAG CTC CAT GAA GCG GAA CTT GAG AAC GTC TTC TAC AGG ATA 288
Ser Met Lys Leu His Glu Ala Glu Leu Glu Asn Val Phe Tyr Arg Ile
85 90 95 

GAG ATG CCG CTT GTG AAC GTT CTT GCA CGC ATG GAA TTG AAC GGG GTG 336
Glu Met Pro Leu Val Asn Val Leu Ala Arg Met Glu Leu Asn Gly Val
100 105 110

TAT GTG GAC ACA GAA TTC CTG AAA AAG CTC TCG GAG GAG TAC GGC AAA 384
Tyr Val Asp Thr Glu Phe Leu Lys Lys Leu Ser Glu Glu Tyr Gly Lys
115 120 125

AAG CTC GAG GAA CTG GCC GAA AAA ATC TAC CAG ATA GCA GGA GAG CCC 432
Lys Leu Glu Glu Leu Ala Glu Lys Ile Tyr Gln Ile Ala Gly Glu Pro
130 135 140 

TTC AAC ATC AAT TCT CCA AAA CAG GTT TCA AAG ATC CTT TTT GAG AAG 480
Phe Asn Ile Asn Ser Pro Lys Gln Val Ser Lys Ile Leu Phe Glu Lys
145 150 155 160 

CTG GGA ATA AAA CCC CGT GGA AAA ACG ACA AAA ACA GGA GCG TAC TCT 528
Leu Gly Ile Lys Pro Arg Gly Lys Thr Thr Lys Thr Gly Ala Tyr Ser
165 170 175 

ACC AGG ATA GAG GTG TTG GAA GAG ATA GCG AAT GAG CAC GAG ATA GTA 576 
Thr Arg Ile Glu Val Leu Glu Glu Ile Ala Asn Glu His Glu Ile Val
180 185 190 

CCC CTC ATT CTC GAG TAC AGA AAG ATC CAG AAA CTG AAA TCG ACC TAC 624 
Pro Leu Ile Leu Glu Tyr Arg Lys Ile Gln Lys Leu Lys Ser Thr Tyr
195 200 205 

ATA GAC ACC CTT CCG AAA CTT GTG AAC CCG AAA ACC GGA AGA ATT CAT 672 
Ile Asp Thr Leu Pro Lys Leu Val Asn Pro Lys Thr Gly Arg Ile His
210 215 220 

GCA TCT TTC CAC CAG ACG GGT ACC GCC ACT GGC AGG TTG AGT AGC AGT 720
Ala Ser Phe His Gln Thr Gly Thr Ala Thr Gly Arg Leu Ser Ser Ser
225 230 235 240 

GAT CCA AAT CTT CAG AAT CTT CCG ACA AAG AGC GAA GAG GGA AAA GAA 768
Asp Pro Asn Leu Gln Asn Leu Pro Thr Lys Ser Glu Glu Gly Lys Glu
245 250 255 

ATT AGA AAA GCG ATT GTG CCC CAG GAT CCA GAC TGG TGG ATC GTC AGT 816 
Ile Arg Lys Ala Ile Val Pro Gln Asp Pro Asp Trp Trp Ile Val Ser
260 265 270 

GCG GAT TAT TCC CAA ATA GAA CTC AGA ATC CTC GCT CAT CTC AGT GGT 864 
Ala Asp Tyr Ser Gln Ile Glu Leu Arg Ile Leu Ala His Leu Ser Gly
275 280 285 

GAT GAG AAC CTT GTG AAG GCC TTC GAG GAG GGC ATC GAT GTG CAC ACC 912 
Asp Glu Asn Leu Val Lys Ala Phe Glu Glu Gly Ile Asp Val His Thr
290 295 300 

TTG ACT GCC TCC AGG ATC TAC AAC GTA AAG CCA GAA GAA GTG AAC GAA 960
Leu Thr Ala Ser Arg Ile Tyr Asn Val Lys Pro Glu Glu Val Asn Glu
305 310 315 320 

GAA ATG CGA CGG GTT GGA AAG ATG GTG AAC TTC TCT ATA ATA TAC GGT 1008 
Glu Met Arg Arg Val Gly Lys Met Val Asn Phe Ser Ile Ile Tyr Gly
325 330 335 

GTC ACA CCG TAC GGT CTT TCT GTG AGA CTT GGA ATA CCG GTT AAA GAA 1056
Val Thr Pro Tyr Gly Leu Ser Val Arg Leu Gly Ile Pro Val Lys Glu
340 345 350 

GCA GAA AAG ATG ATT ATC AGC TAT TTC ACA CTG TAT CCA AAG GTG CGA 1104 
Ala Glu Lys Met Ile Ile Ser Tyr Phe Thr Leu Tyr Pro Lys Val Arg
355 360 365 

AGC TAC ATC CAG CAG GTT GTT GCA GAG GCA AAA GAG AAG GGC TAC GTC 1152 
Ser Tyr Ile Gln Gln Val Val Ala Glu Ala Lys Glu Lys Gly Tyr Val
370 375 380

AGG ACT CTC TTT GGA AGA AAA AGA GAT ATT CCC CAG CTC ATG GCA AGG 1200
Arg Thr Leu Phe Gly Arg Lys Arg Asp Ile Pro Gln Leu Met Ala Arg
385 390 395 400 

GAC AAG AAC ACC CAG TCC GAA GGC GAA AGA ATC GCA ATA AAC ACC CCC 1248
Asp Lys Asn Thr Gln Ser Glu Gly Glu Arg Ile Ala Ile Asn Thr Pro
405 410 415 

ATT CAG GGA ACG GCG GCA GAT ATA ATA AAA TTG GCT ATG ATA GAT ATA 1296
Ile Gln Gly Thr Ala Ala Asp Ile Ile Lys Leu Ala Met Ile Asp Ile
420 425 430 

GAC GAG GAG CTG AGA AAA AGA AAC ATG AAA TCC AGA ATG ATC ATT CAG 1344
Asp Glu Glu Leu Arg Lys Arg Asn Met Lys Ser Arg Met Ile Ile Gln
435 440 445 

GTT CAT GAC GAA CTG GTC TTC GAG GTT CCC GAT GAG GAA AAA GAA GAA 1392
Val His Asp Glu Leu Val Phe Glu Val Pro Asp Glu Glu Lys Glu Glu
450 455 460 

CTA GTT GAT CTG GTG AAG AAC AAA ATG ACA AAT GTG GTG AAA CTC TCT 1440
Leu Val Asp Leu Val Lys Asn Lys Met Thr Asn Val Val Lys Leu Ser
465 470 475 480 

GTG CCT CTT GAG GTT GAC ATA AGC ATC GGA AAA AGC TGG TCT TG 1485 
Val Pro Leu Glu Val Asp Ile Ser Ile Gly Lys Ser Trp Ser
485 490 495

(2) INFORMATION FOR SEQ ID NO:39: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 494 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39: 

Met Lys Glu Leu Gln Leu Tyr Glu Glu Ala Glu Pro Thr Gly Tyr Glu
1 5 10 15

Ile Val Lys Asp His Lys Thr Phe Glu Asp Leu Ser Leu Lys Phe Leu
20 25 30

Gly Tyr Lys Met Thr Ser Tyr Gln Glu Leu Met Ser Phe Ser Ser Pro
35 40 45

Leu Phe Gly Phe Ser Phe Ala Asp Val Pro Val Asp Lys Ala Ala Asn
50 55 60

Tyr Ser Cys Glu Asp Ala Asp Ile Thr Tyr Arg Leu Tyr Lys Ile Leu
65 70 75 80

Ser Met Lys Leu His Glu Ala Glu Leu Glu Asn Val Phe Tyr Arg Ile
85 90 95

Glu Met Pro Leu Val Asn Val Leu Ala Arg Met Glu Leu Asn Gly Val
100 105 110

Tyr Val Asp Thr Glu Phe Leu Lys Lys Leu Ser Glu Glu Tyr Gly Lys
115 120 125

Lys Leu Glu Glu Leu Ala Glu Lys Ile Tyr Gln Ile Ala Gly Glu Pro
130 135 140

Phe Asn Ile Asn Ser Pro Lys Gln Val Ser Lys Ile Leu Phe Glu Lys
145 150 155 160

Leu Gly Ile Lys Pro Arg Gly Lys Thr Thr Lys Thr Gly Ala Tyr Ser
165 170 175

Thr Arg Ile Glu Val Leu Glu Glu Ile Ala Asn Glu His Glu Ile Val
180 185 190

Pro Leu Ile Leu Glu Tyr Arg Lys Ile Gln Lys Leu Lys Ser Thr Tyr
195 200 205

Ile Asp Thr Leu Pro Lys Leu Val Asn Pro Lys Thr Gly Arg Ile His
210 215 220

Ala Ser Phe His Gln Thr Gly Thr Ala Thr Gly Arg Leu Ser Ser Ser
225 230 235 240

Asp Pro Asn Leu Gln Asn Leu Pro Thr Lys Ser Glu Glu Gly Lys Glu
245 250 255

Ile Arg Lys Ala Ile Val Pro Gln Asp Pro Asp Trp Trp Ile Val Ser
260 265 270

Ala Asp Tyr Ser Gln Ile Glu Leu Arg Ile Leu Ala His Leu Ser Gly
275 280 285

Asp Glu Asn Leu Val Lys Ala Phe Glu Glu Gly Ile Asp Val His Thr
290 295 300

Leu Thr Ala Ser Arg Ile Tyr Asn Val Lys Pro Glu Glu Val Asn Glu
305 310 315 320

Glu Met Arg Arg Val Gly Lys Met Val Asn Phe Ser Ile Ile Tyr Gly
325 330 335

Val Thr Pro Tyr Gly Leu Ser Val Arg Leu Gly Ile Pro Val Lys Glu
340 345 350

Ala Glu Lys Met Ile Ile Ser Tyr Phe Thr Leu Tyr Pro Lys Val Arg
355 360 365

Ser Tyr Ile Gln Gln Val Val Ala Glu Ala Lys Glu Lys Gly Tyr Val
370 375 380

Arg Thr Leu Phe Gly Arg Lys Arg Asp Ile Pro Gln Leu Met Ala Arg
385 390 395 400

Asp Lys Asn Thr Gln Ser Glu Gly Glu Arg Ile Ala Ile Asn Thr Pro
405 410 415

Ile Gln Gly Thr Ala Ala Asp Ile Ile Lys Leu Ala Met Ile Asp Ile
420 425 430

Asp Glu Glu Leu Arg Lys Arg Asn Met Lys Ser Arg Met Ile Ile Gln
435 440 445

Val His Asp Glu Leu Val Phe Glu Val Pro Asp Glu Glu Lys Glu Glu
450 455 460

Leu Val Asp Leu Val Lys Asn Lys Met Thr Asn Val Val Lys Leu Ser
465 470 475 480

Val Pro Leu Glu Val Asp Ile Ser Ile Gly Lys Ser Trp Ser
485 490


(2) INFORMATION FOR SEQ ID NO: 40: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40: 
CGCCAGGGTT TTCCCAGTCA CGAC 24 
(2) INFORMATION FOR SEQ ID NO: 41: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 74 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41: 
ATAAGCGCCA TTGATGTTCC TCTCTACTCG AAAGTTAGAG AGGACACACC CGATCCCTAT 60 
AGTGAGTCGT ATTA 74 
(2) INFORMATION FOR SEQ ID NO: 42: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42: 
TAATACGACT CACTATAGGG CGAAT 25

(2) INFORMATION FOR SEQ ID NO: 43: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43:
GAATCGTCGT ATGCAGTGAA AACTC 
(2) INFORMATION FOR SEQ ID NO: 44: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44: 

CTTGATTGAC AAGGATGGAT GGCTA 

(2) INFORMATION FOR SEQ ID NO: 45: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45:

CATGGTTTAA ATCCTGTGTG AAATTGTTAT CCG 

(2) INFORMATION FOR SEQ ID NO: 46: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 29 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46:
CGGATAACAA TTTCACACAG GATTTAAAC
(2) INFORMATION FOR SEQ ID NO: 47:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 25 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:47: 
TAATACGACT CACTATAGGG CGAAT
(2) INFORMATION FOR SEQ ID NO: 48:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:48: 

CATGCCATGG CATGCATTTA CGTTGACACC A

(2) INFORMATION FOR SEQ ID NO: 49:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 34 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49: 

TCCCCCGGGT TGCGCTCACT GCCCGCTTTC CAGT 

(2) INFORMATION FOR SEQ ID NO: 50: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 34 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50: 

AGCTTATCGA TGGCACTTTT CGGGGAAATG TGCG 

(2) INFORMATION FOR SEQ ID NO: 51: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 35 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51: 
AGCTTATCGA TAAGCGATGC CGGGAGCAGA CAAGC